[LLDB][NativePDB] Allow type lookup in namespaces #149876

Nerixyz · 2025-07-21T19:13:45Z

Previously, type lookup for types in namespaces didn't work with the native PDB plugin, because FindTypes would only look for types whose base name was equal to their full name. PDB/CodeView does not store the base names in the TPI stream, but the types have their full name (e.g. std::thread instead of thread). So findRecordsByName would only return types in the top level namespace.

This PR changes the lookup to go through all types and check their base name. As that could be a bit expensive, the names are first cached (similar to the function lookup in the DIA PDB plugin). Potential types are checked with TypeQuery::ContextMatches.
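As a rough illustration of the caching idea, here is a minimal standalone sketch that strips the scope from each full name and groups record indices by base name. `DropScopeSimple`, `BaseNameIndex`, and `BuildBaseNameIndex` are hypothetical names invented for this sketch; the PR itself uses `MSVCUndecoratedNameParser::DropScope` and a `UniqueCStringMap`, and the simplified splitter here ignores `::` inside template arguments.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-in for MSVCUndecoratedNameParser::DropScope:
// returns everything after the last "::". Unlike a real parser, it does not
// account for "::" that appears inside template arguments.
std::string_view DropScopeSimple(std::string_view full_name) {
  const std::size_t pos = full_name.rfind("::");
  return pos == std::string_view::npos ? full_name : full_name.substr(pos + 2);
}

// Maps a base name ("thread") to the indices of all records whose full name
// ends in it ("std::thread", "boost::thread", ...).
using BaseNameIndex =
    std::unordered_map<std::string, std::vector<std::uint32_t>>;

BaseNameIndex BuildBaseNameIndex(const std::vector<std::string> &full_names) {
  BaseNameIndex index;
  for (std::uint32_t i = 0; i < full_names.size(); ++i) {
    if (full_names[i].empty())
      continue; // Unnamed records cannot be found by name.
    index[std::string(DropScopeSimple(full_names[i]))].push_back(i);
  }
  return index;
}
```

With names such as `S`, `Outer::S`, and `Outer::Inner1::S`, a lookup for `S` returns all three candidates, which then still have to be filtered by their enclosing context.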

To be able to handle anonymous namespaces, I changed TypeQuery::ContextMatches. The TypeQuery constructor inserts all name components as CompilerContextKind::AnyDeclContext. To skip over anonymous namespaces, ContextMatches checked if a component was empty and exactly of kind Namespace. For our query, the last check was always false, so we never skipped anonymous namespaces. DWARF doesn't have this problem, as it constructs the context outside and has proper information about namespaces. I'm not fully sure if my change is correct and that it doesn't break other users of TypeQuery.
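The difference between the two checks can be sketched in isolation. The enum below uses illustrative bit values that are not copied from LLDB's `CompilerContextKind`; `Context`, `IsAnonymousNamespaceExact`, and `IsAnonymousNamespaceMasked` are hypothetical names for this sketch only.

```cpp
#include <cstdint>
#include <string>

// Illustrative bit values only; they are NOT taken from LLDB's
// CompilerContextKind definition.
enum ContextKind : std::uint16_t {
  kNamespace = 1u << 0,
  kClassOrStruct = 1u << 1,
  kUnion = 1u << 2,
  kEnum = 1u << 3,
  // Wildcard that accepts any scope-like kind.
  kAnyDeclContext = kNamespace | kClassOrStruct | kUnion | kEnum,
};

struct Context {
  std::uint16_t kind;
  std::string name;
};

// Old-style check: only an entry that is exactly a namespace counts, so a
// wildcard entry with an empty name is never treated as anonymous.
bool IsAnonymousNamespaceExact(const Context &ctx) {
  return ctx.kind == kNamespace && ctx.name.empty();
}

// New-style check: any entry whose kind includes the namespace bit counts,
// which covers both the precise kind and the wildcard.
bool IsAnonymousNamespaceMasked(const Context &ctx) {
  return (ctx.kind & kNamespace) == kNamespace && ctx.name.empty();
}
```

For a context entry built from a parsed name, i.e. `{kAnyDeclContext, ""}`, the exact comparison returns false while the masked one returns true. The review discussion further down questions whether the matched context should ever be a wildcard in the first place.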

This enables type lookup <type> to work on types in namespaces. However, expressions don't work with this yet, because FindNamespace is unimplemented for native PDB.

llvmbot · 2025-07-21T19:14:16Z

@llvm/pr-subscribers-lldb

Author: nerix (Nerixyz)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/149876.diff

5 Files Affected:

(modified) lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp (+54-4)
(modified) lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h (+4)
(modified) lldb/source/Symbol/Type.cpp (+6-2)
(added) lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit (+18)
(added) lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp (+115)

diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
index 20d8c1acf9c42..5141632649dd5 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
@@ -1630,6 +1630,53 @@ size_t SymbolFileNativePDB::ParseSymbolArrayInScope(
   return count;
 }
 
+void SymbolFileNativePDB::CacheTypeNames() {
+  if (!m_type_base_names.IsEmpty())
+    return;
+
+  LazyRandomTypeCollection &types = m_index->tpi().typeCollection();
+  for (auto ti = types.getFirst(); ti; ti = types.getNext(*ti)) {
+    CVType cvt = types.getType(*ti);
+    llvm::StringRef name;
+    // We are only interested in records, unions, and enums.
+    // We aren't interested in forward references as we'll visit the actual
+    // type later anyway.
+    switch (cvt.kind()) {
+    case LF_STRUCTURE:
+    case LF_CLASS: {
+      ClassRecord cr;
+      llvm::cantFail(TypeDeserializer::deserializeAs<ClassRecord>(cvt, cr));
+      if (cr.isForwardRef())
+        continue;
+      name = cr.Name;
+    } break;
+    case LF_UNION: {
+      UnionRecord ur;
+      llvm::cantFail(TypeDeserializer::deserializeAs<UnionRecord>(cvt, ur));
+      if (ur.isForwardRef())
+        continue;
+      name = ur.Name;
+    } break;
+    case LF_ENUM: {
+      EnumRecord er;
+      llvm::cantFail(TypeDeserializer::deserializeAs<EnumRecord>(cvt, er));
+      if (er.isForwardRef())
+        continue;
+      name = er.Name;
+    } break;
+    default:
+      continue;
+    }
+    if (name.empty())
+      continue;
+
+    auto base_name = MSVCUndecoratedNameParser::DropScope(name);
+    m_type_base_names.Append(ConstString(base_name), ti->getIndex());
+  }
+
+  m_type_base_names.Sort();
+}
+
 void SymbolFileNativePDB::DumpClangAST(Stream &s, llvm::StringRef filter) {
   auto ts_or_err = GetTypeSystemForLanguage(eLanguageTypeC_plus_plus);
   if (!ts_or_err)
@@ -1720,11 +1767,14 @@ void SymbolFileNativePDB::FindTypes(const lldb_private::TypeQuery &query,
 
   std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
 
-  std::vector<TypeIndex> matches =
-      m_index->tpi().findRecordsByName(query.GetTypeBasename().GetStringRef());
+  // We can't query for the basename or full name because the type might reside
+  // in an anonymous namespace. Cache the basenames first.
+  CacheTypeNames();
+  std::vector<uint32_t> matches;
+  m_type_base_names.GetValues(query.GetTypeBasename(), matches);
 
-  for (TypeIndex type_idx : matches) {
-    TypeSP type_sp = GetOrCreateType(type_idx);
+  for (uint32_t match_idx : matches) {
+    TypeSP type_sp = GetOrCreateType(TypeIndex(match_idx));
     if (!type_sp)
       continue;
 
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
index 9891313f11d0b..457b301c4a486 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.h
@@ -258,6 +258,8 @@ class SymbolFileNativePDB : public SymbolFileCommon {
 
   void ParseInlineSite(PdbCompilandSymId inline_site_id, Address func_addr);
 
+  void CacheTypeNames();
+
   llvm::BumpPtrAllocator m_allocator;
 
   lldb::addr_t m_obj_load_address = 0;
@@ -278,6 +280,8 @@ class SymbolFileNativePDB : public SymbolFileCommon {
   llvm::DenseMap<lldb::user_id_t, std::shared_ptr<InlineSite>> m_inline_sites;
   llvm::DenseMap<llvm::codeview::TypeIndex, llvm::codeview::TypeIndex>
       m_parent_types;
+
+  lldb_private::UniqueCStringMap<uint32_t> m_type_base_names;
 };
 
 } // namespace npdb
diff --git a/lldb/source/Symbol/Type.cpp b/lldb/source/Symbol/Type.cpp
index 0a886e56100a1..ddb22d611140b 100644
--- a/lldb/source/Symbol/Type.cpp
+++ b/lldb/source/Symbol/Type.cpp
@@ -134,7 +134,9 @@ bool TypeQuery::ContextMatches(
     if (ctx == ctx_end)
       return false; // Pattern too long.
 
-    if (ctx->kind == CompilerContextKind::Namespace && ctx->name.IsEmpty()) {
+    if ((ctx->kind & CompilerContextKind::Namespace) ==
+            CompilerContextKind::Namespace &&
+        ctx->name.IsEmpty()) {
       // We're matching an anonymous namespace. These are optional, so we check
       // if the pattern expects an anonymous namespace.
       if (pat->name.IsEmpty() && (pat->kind & CompilerContextKind::Namespace) ==
@@ -164,7 +166,9 @@ bool TypeQuery::ContextMatches(
   auto should_skip = [this](const CompilerContext &ctx) {
     if (ctx.kind == CompilerContextKind::Module)
       return GetIgnoreModules();
-    if (ctx.kind == CompilerContextKind::Namespace && ctx.name.IsEmpty())
+    if ((ctx.kind & CompilerContextKind::Namespace) ==
+            CompilerContextKind::Namespace &&
+        ctx.name.IsEmpty())
       return !GetStrictNamespaces();
     return false;
   };
diff --git a/lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit b/lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit
new file mode 100644
index 0000000000000..e61ed2e2f453e
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit
@@ -0,0 +1,18 @@
+b main
+r
+
+type lookup S
+type lookup ::S
+type lookup Outer::S
+type lookup Outer::Inner1::S
+type lookup Inner1::S
+type lookup Outer::Inner1::Inner2::S
+type lookup Inner2::S
+type lookup Outer::Inner2::S
+type lookup Outer::A
+type lookup A
+type lookup ::A
+expr sizeof(S)
+expr sizeof(A)
+
+quit
diff --git a/lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp b/lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp
new file mode 100644
index 0000000000000..8dbe062d8240f
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/NativePDB/namespace-access.cpp
@@ -0,0 +1,115 @@
+// clang-format off
+// REQUIRES: lld, x86
+
+// Test namespace lookup.
+// RUN: %clang_cl --target=x86_64-windows-msvc -Od -Z7 -GS- -c /Fo%t.obj -- %s
+// RUN: lld-link -debug:full -nodefaultlib -entry:main %t.obj -out:%t.exe -pdb:%t.pdb
+// RUN: %lldb -f %t.exe -s \
+// RUN:     %p/Inputs/namespace-access.lldbinit 2>&1 | FileCheck %s
+
+struct S {
+  char a[1];
+};
+
+namespace Outer {
+
+  struct S {
+    char a[2];
+  };
+
+  namespace Inner1 {
+    struct S {
+      char a[3];
+    };
+
+    namespace Inner2 {
+      struct S {
+        char a[4];
+      };
+    } // namespace Inner2
+  } // namespace Inner1
+
+  namespace Inner2 {
+    struct S {
+      char a[5];
+    };
+  } // namespace Inner2
+
+  namespace {
+    struct A {
+      char a[6];
+    };
+  } // namespace
+
+} // namespace Outer
+
+namespace {
+  struct A {
+    char a[7];
+  };
+} // namespace
+
+int main(int argc, char **argv) {
+  S s;
+  Outer::S os;
+  Outer::Inner1::S oi1s;
+  Outer::Inner1::Inner2::S oi1i2s;
+  Outer::Inner2::S oi2s;
+  A a1;
+  Outer::A a2;
+  return sizeof(s) + sizeof(os) + sizeof(oi1s) + sizeof(oi1i2s) + sizeof(oi2s) + sizeof(a1) + sizeof(a2);
+}
+
+
+
+// CHECK:      (lldb) type lookup S  
+// CHECK:      struct S {
+// CHECK:      struct S {
+// CHECK:      struct S {
+// CHECK:      struct S {
+// CHECK:      struct S {
+// CHECK:      }
+// CHECK-NEXT: (lldb) type lookup ::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[1];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[2];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::Inner1::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[3];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Inner1::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[3];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::Inner1::Inner2::S
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[4];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Inner2::S         
+// CHECK-NEXT: struct S {
+// CHECK:      struct S {
+// CHECK:      }
+// CHECK-NEXT: (lldb) type lookup Outer::Inner2::S        
+// CHECK-NEXT: struct S {
+// CHECK-NEXT:     char a[5];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup Outer::A        
+// CHECK-NEXT: struct A {
+// CHECK-NEXT:     char a[6];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) type lookup A       
+// CHECK-NEXT: struct A {
+// CHECK:      struct A {
+// CHECK:      }
+// CHECK-NEXT: (lldb) type lookup ::A
+// CHECK-NEXT: struct A {
+// CHECK-NEXT:     char a[7];
+// CHECK-NEXT: }
+// CHECK-NEXT: (lldb) expr sizeof(S) 
+// CHECK-NEXT: (__size_t) $0 = 1
+// CHECK-NEXT: (lldb) expr sizeof(A)
+// CHECK-NEXT: (__size_t) $1 = 7

ZequanWu · 2025-07-21T20:48:50Z

lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp

@@ -1630,6 +1630,53 @@ size_t SymbolFileNativePDB::ParseSymbolArrayInScope(
  return count;
 }

+void SymbolFileNativePDB::CacheTypeNames() {


Actually, SymbolFileNativePDB::BuildParentMap() already iterates over the TPI stream, and it's called during the NativePDB plugin's initial setup. We could just cache those base names there instead of iterating over the stream a second time.

Michael137 · 2025-07-22T09:11:16Z

lldb/test/Shell/SymbolFile/NativePDB/Inputs/namespace-access.lldbinit

+expr sizeof(S)
+expr sizeof(A)
+
+quit


Let's put these commands into the test file itself. You can use split-file for this. For example:

llvm-project/lldb/test/Shell/Settings/TestFrameFormatFunctionScopeObjC.test

Line 4 in 81651e9

# RUN: split-file %s %t

Michael137 · 2025-07-23T10:46:24Z

lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp

@@ -2291,6 +2298,8 @@ void SymbolFileNativePDB::BuildParentMap() {
    TypeIndex fwd = full_to_forward[full];
    m_parent_types[fwd] = m_parent_types[full];
  }
+
+  m_type_base_names.Sort();


Why do we need this sort?

Nerixyz · 2025-07-23

m_type_base_names is a UniqueCStringMap, which holds a vector<(CString, T)>. To act as a map (i.e. to support binary search), that vector needs to be sorted. Append(element) only translates to push_back, so once all elements are inserted the vector is still unsorted, and we have to sort it before we can look up elements.
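The append-then-sort pattern described here can be sketched with the standard library alone. `SortedNameMap` is a hypothetical minimal stand-in written for this illustration; it is not LLDB's `UniqueCStringMap` and only mirrors the three operations mentioned in the thread.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical vector-backed multimap; not LLDB's UniqueCStringMap.
class SortedNameMap {
public:
  using Entry = std::pair<std::string, std::uint32_t>;

  // Cheap insertion: only a push_back, so the vector stays unsorted.
  void Append(std::string name, std::uint32_t value) {
    m_entries.emplace_back(std::move(name), value);
  }

  // Must run after the last Append and before the first lookup.
  void Sort() { std::stable_sort(m_entries.begin(), m_entries.end(), ByName); }

  // Binary search over the entries; results are only meaningful once the
  // vector has been sorted by name.
  std::vector<std::uint32_t> GetValues(const std::string &name) const {
    const auto range = std::equal_range(m_entries.begin(), m_entries.end(),
                                        Entry(name, 0), ByName);
    std::vector<std::uint32_t> values;
    for (auto it = range.first; it != range.second; ++it)
      values.push_back(it->second);
    return values;
  }

private:
  // Orders entries by name only, so equal names form one contiguous run.
  static bool ByName(const Entry &a, const Entry &b) {
    return a.first < b.first;
  }

  std::vector<Entry> m_entries;
};
```

Deferring the sort keeps each insertion cheap and pays the ordering cost once, which suits a cache that is filled in a single pass over the type stream and only read afterwards.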

Michael137 · 2025-07-23T11:30:25Z

lldb/source/Symbol/Type.cpp

@@ -134,7 +134,9 @@ bool TypeQuery::ContextMatches(
    if (ctx == ctx_end)
      return false; // Pattern too long.

-    if (ctx->kind == CompilerContextKind::Namespace && ctx->name.IsEmpty()) {
+    if ((ctx->kind & CompilerContextKind::Namespace) ==


@labath since you looked at this some time ago, do you see a concern with this? I guess the alternative is to make sure we don't construct the TypeQuery with an AnyDeclContext in PDB-land. @Nerixyz why can't we do that? Is there some PDB limitation?

Nerixyz · 2025-07-23

> why can't we do that? Is there some PDB limitation?

We could do that. It would probably replicate the TypeQuery constructor, though (which arguably isn't that large).

labath · 2025-07-25

Yeah, so the current assumption is that when matching types, the pattern can contain wildcards (you can say "I'm looking for a type or a namespace at a given position"), but the thing you're matching is always precise (it will say "I'm a type" or "I'm a namespace", but it can't say "I don't know"). For DWARF this isn't an issue because we always have that information. I don't know if that's the case for PDB, or if it's just a side effect of how it performs the query (*). If it's not possible to do it the other way, then we may have to change the way we do matching, but I think it'd be better to not do that.

(*) Currently, the PDB code kind of cheats by taking the type name and parsing it as a type query (SymbolFileNativePDB.cpp:1736 in this patch), but then using the result of that parsing as the thing to match. This is bad for two reasons:

1. If you're parsing from a name, it's clear that you won't be able to see whether enclosing scopes are a type or a namespace. The information just isn't there: Foo::Bar looks the same regardless of whether Foo is a namespace or not.

2. It's going to be slow, because this approach requires you to fully materialize the type (and everything it depends on) before knowing if this is even the type you're looking for. This may not be as big of an issue for PDB as it is for DWARF because (so I hear) it doesn't have template information (which is the big source of fanout in this process) and it may have fewer of these searches internally (because it's already linked; in DWARF we need to do this just to search for a definition DIE), but it still means that a search for foo::iterator will materialize every iterator out there.

For this reason it would be better if the type context were created differently, without relying on a Type object. For DWARF this happens in DWARFDIE::GetTypeLookupContext, but I don't know if something like that works in PDB. The patch description seems to indicate that this information does not exist, but I don't understand how anything could work in that case. E.g. I don't know what kind of Clang AST would be created for these types. Are they properly nested in their expected namespace? If so, how do you figure out that namespace?

Nerixyz · 2025-07-25

That's a good point. I'll create the context in the PDB plugin. To avoid materializing all types that match the basename, I can see two approaches:

1. Naively use the type's name (PDB stores the demangled name there). This would essentially do the same as the TypeQuery constructor, where it tries to separate the scopes. Every scope/context but the last one would be a namespace. For the last one, we can find the exact type.

2. PDB also contains a UniqueName field, which can contain a mangled type name. Similar to CreateDeclInfoForType, the name could be demangled and each scope could be checked.

In both cases, it's possible to try to get the parent to handle nested structs/enums/unions like in CreateDeclInfoForType.

labath · 2025-07-28

I don't know which approach is better, but I like the direction of this. I'll just say that if you find yourself needing to parse a type name, don't reimplement that code. I'd rather take the parsing code out of the TypeQuery constructor and factor it into a function that can be called independently.

How expensive is this "get the parent to handle nested structs/enums/unions" operation? I'm asking because I'm wondering if we can just do it all the time, or if we should try to avoid it. If it's cheap, we can just construct the precise context (one where every scope knows whether it's a type or a namespace) every time. The advantage of that is that we can reuse the existing infrastructure, and the overall flow will be very similar to how things work in DWARF. However, if it's expensive, then we may want to do something like you've done here, where we first do a "fuzzy" match (just checking whether the names match, not looking at the kinds), and then we confirm the match (if needed) by looking at the kinds.

Nerixyz · 2025-07-28

> How expensive is this "get the parent to handle nested structs/enums/unions" operation?

This is only a lookup in a map that we created. But it only works for tag types, not for namespaces. My approach now uses Type::GetTypeScopeAndBasename to split the undecorated name. At first, it assumes everything is a namespace, and then it walks backwards to find parent types.
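The "assume namespaces, then walk backwards" idea might look roughly like the following standalone sketch. Everything here is hypothetical: `SplitScopes` stands in for the name splitting, and the `known_types` set of fully qualified tag-type names stands in for the plugin's parent map. It is not the code from the final commit.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <string_view>
#include <vector>

enum class ScopeKind { Namespace, Type };

struct ScopeEntry {
  ScopeKind kind;
  std::string name;
};

// Splits "A::B::C" on "::". Simplified: does not handle "::" nested inside
// template arguments.
std::vector<std::string> SplitScopes(std::string_view full_name) {
  std::vector<std::string> parts;
  std::size_t start = 0;
  while (true) {
    const std::size_t pos = full_name.find("::", start);
    if (pos == std::string_view::npos) {
      parts.emplace_back(full_name.substr(start));
      return parts;
    }
    parts.emplace_back(full_name.substr(start, pos - start));
    start = pos + 2;
  }
}

// known_types is a hypothetical stand-in for the parent map: the fully
// qualified names of all tag types (struct/class/union/enum).
std::vector<ScopeEntry> BuildContext(std::string_view full_name,
                                     const std::set<std::string> &known_types) {
  const std::vector<std::string> parts = SplitScopes(full_name);
  std::vector<ScopeEntry> context;
  // Step 1: the last component is the type itself; assume that every
  // enclosing scope is a namespace.
  for (std::size_t i = 0; i < parts.size(); ++i)
    context.push_back({i + 1 == parts.size() ? ScopeKind::Type
                                             : ScopeKind::Namespace,
                       parts[i]});

  // Step 2: walk backwards over the enclosing scopes. As long as the
  // qualified prefix names a known type, mark it as a type. A namespace
  // cannot be nested inside a type, so stop at the first non-type.
  std::string prefix(full_name);
  for (std::size_t i = parts.size() - 1; i-- > 0;) {
    prefix.resize(prefix.size() - parts[i + 1].size() - 2);
    if (known_types.count(prefix) == 0)
      break;
    context[i].kind = ScopeKind::Type;
  }
  return context;
}
```

Given `Outer::Widget::Nested::E` with `Outer::Widget` and `Outer::Widget::Nested` known as tag types, this yields a namespace entry for `Outer` followed by three type entries, so the resulting context is precise rather than a wildcard.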

I'm not sure what the best way to test this is.

Nerixyz requested a review from JDevlieghere as a code owner July 21, 2025 19:13

llvmbot added the lldb label Jul 21, 2025

[LLDB][NativePDB] Allow type lookup in namespaces

fa3c96b

Nerixyz force-pushed the fix/lldb-npdb-type-anon-ns branch from 95e8fab to fa3c96b Compare July 21, 2025 19:28

ZequanWu reviewed Jul 21, 2025

Michael137 reviewed Jul 22, 2025

refactor: move basename discovery to BuildParentMap

288b5f7

Michael137 reviewed Jul 23, 2025

Michael137 reviewed Jul 23, 2025

Nerixyz force-pushed the fix/lldb-npdb-type-anon-ns branch from 1d99fe1 to 38e1ac7 Compare July 28, 2025 10:15

Nerixyz added 2 commits July 28, 2025 12:19

refactor: convert test to use split-file

1dbde10

refactor: create context in symbol file plugin

8344a98

Nerixyz force-pushed the fix/lldb-npdb-type-anon-ns branch from 38e1ac7 to 8344a98 Compare July 28, 2025 10:20
