Conversation

@wes-mil wes-mil commented Dec 22, 2025

Description

Creates SQL files to populate AD and AZ extension schemas

Motivation and Context

Resolves BED-6721

Why is this change required? What problem does it solve?

All schema definitions are being moved over to postgres. This incremental change moves the AD and AZ schemas over to postgres without removing existing functionality.

How Has This Been Tested?

Code is generated by running just generate.

Data population is run on app startup.

Screenshots (optional):

  • Screenshot from 2025-12-22 14-50-22
  • Screenshot from 2025-12-22 14-50-36
  • Screenshot from 2025-12-22 14-50-48

Types of changes

  • New feature (non-breaking change which adds functionality)
  • Database Migrations (sorta)

Checklist:

Summary by CodeRabbit

  • New Features

    • Added automatic extension data population during system initialization.
    • Introduced schema definitions for Active Directory and Azure extensions, including node and relationship types.
  • Chores

    • Enhanced structured logging for error reporting throughout the migration and initialization processes.


coderabbitai bot commented Dec 22, 2025

Walkthrough

This pull request introduces extension data population as a new initialization step in the database setup flow. It adds the PopulateExtensionData function across bootstrap, database, and migration layers, implements SQL generation for Active Directory and Azure extensions, and integrates the step into service entrypoints and test fixtures to populate extension metadata before graph migrations.

Changes

Cohort / File(s) Summary
Bootstrap Wrapper
cmd/api/src/bootstrap/server.go
Added new public function PopulateExtensionData that wraps database extension data population.
Database Interface & Implementation
cmd/api/src/database/db.go, cmd/api/src/database/mocks/db.go
Added PopulateExtensionData(ctx context.Context) error to Database interface and BloodhoundDB implementation; invokes migrator ExecuteExtensionDataPopulation. Updated mocks accordingly.
Migration System
cmd/api/src/database/migration/migration.go, cmd/api/src/database/migration/stepwise.go
Embedded extension SQL files via //go:embed extensions, added ExtensionsData field to Migrator, and implemented ExecuteExtensionDataPopulation() method to read and execute .sql files from extensions directory.
Extension Schema Definitions
cmd/api/src/database/migration/extensions/ad.sql, cmd/api/src/database/migration/extensions/az.sql
PostgreSQL migration scripts defining Active Directory and Azure extensions with node kinds and edge kinds schema metadata.
Schema Generation
packages/go/schemagen/generator/sql.go, packages/go/schemagen/main.go
Added NodeIcon type, NodeIcons map for UI metadata, and SQL generation functions (GenerateExtensionSQLActiveDirectory, GenerateExtensionSQLAzure) to produce extension SQL files. Integrated GenerateSQL into main workflow.
Logging Enhancement
packages/go/schemagen/generator/cue.go
Replaced fmt.Sprintf debug log with structured slog.Debug call.
Service Integration
cmd/api/src/services/entrypoint.go, packages/go/graphify/graph/graph.go, cmd/api/src/services/graphify/graphify_integration_test.go, cmd/api/src/test/integration/database.go, cmd/api/src/test/lab/fixtures/postgres.go
Integrated PopulateExtensionData call into initialization flows after MigrateDB and before graph migrations across multiple service entrypoints and test fixtures.
Integration Tests
cmd/api/src/daemons/changelog/ingestion_integration_test.go, cmd/api/src/daemons/datapipe/datapipe_integration_test.go, cmd/api/src/database/database_integration_test.go
Added PopulateExtensionData invocation to test setup routines after database migration.

Sequence Diagram

sequenceDiagram
    participant Svc as Service/Entrypoint
    participant Boot as Bootstrap Layer
    participant DB as Database
    participant Mig as Migrator
    participant FS as File System
    
    Svc->>Boot: MigrateDB(ctx, db)
    Boot->>DB: Migrate(ctx)
    DB->>Mig: Migrate(ctx)
    Mig->>FS: Read & execute migration files
    FS-->>Mig: SQL executed
    Mig-->>DB: ✓ Success
    DB-->>Boot: ✓ Success
    
    Svc->>Boot: PopulateExtensionData(ctx, db)
    Boot->>DB: PopulateExtensionData(ctx)
    DB->>Mig: ExecuteExtensionDataPopulation()
    Mig->>FS: ReadDir(extensions/)
    FS-->>Mig: [ad.sql, az.sql]
    Mig->>FS: ReadFile(ad.sql)
    FS-->>Mig: SQL content
    Mig->>DB: Execute AD extension SQL (transaction)
    DB-->>Mig: ✓ Inserted extension metadata
    Mig->>FS: ReadFile(az.sql)
    FS-->>Mig: SQL content
    Mig->>DB: Execute AZ extension SQL (transaction)
    DB-->>Mig: ✓ Inserted extension metadata
    Mig-->>DB: ✓ Success
    DB-->>Svc: ✓ Success
    
    Svc->>Svc: Proceed to graph migrations

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Key areas requiring attention:
    • Schema generation logic in packages/go/schemagen/generator/sql.go — verify NodeIcons mapping completeness and SQL string assembly correctness for node/edge kinds
    • Extension SQL files (ad.sql, az.sql) — ensure all node and edge kind definitions are correct and match schema expectations
    • Error handling consistency in ExecuteExtensionDataPopulation() — verify transactional integrity and error propagation across multiple .sql file executions
    • Integration points across services — confirm PopulateExtensionData is called in the correct order relative to migrations and graph initialization in all code paths

Possibly related PRs

Suggested labels

enhancement, dbmigration

Suggested reviewers

  • LawsonWillard
  • AD7ZJ
  • superlinkx

Poem

🐰 Hop, hop! Extensions bloom,
Data populates every room,
AD and Azure schemas take flight,
Migrations dance through the night!
A fluffy PR, pure delight! 🌸

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 40.00%, which is below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Title check ✅ Passed: The title clearly and specifically summarizes the main change: generating AD and AZ extension schemas and applying them at startup, with direct reference to the Jira ticket.
  • Description check ✅ Passed: The description addresses all required template sections: description of changes, motivation/context with ticket reference, testing details, types of changes, and completed checklist items.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
cmd/api/src/bootstrap/server.go (1)

75-81: Function implementation is correct, but can be simplified.

The function correctly wraps the database's PopulateExtensionData method. However, the explicit return nil on line 80 is redundant since the error is already nil when reaching that point.

🔎 Optional simplification
 func PopulateExtensionData(ctx context.Context, db database.Database) error {
-	if err := db.PopulateExtensionData(ctx); err != nil {
-		return err
-	}
-
-	return nil
+	return db.PopulateExtensionData(ctx)
 }
cmd/api/src/test/lab/fixtures/postgres.go (1)

44-45: Extension data population correctly added, but consider reusing DB instance.

The extension data population step is properly integrated with correct error handling. However, this fixture creates multiple BloodhoundDB instances (lines 40, 42, 44, and 47). While this works correctly, consider creating a single instance at the beginning and reusing it throughout the setup chain for better efficiency.

🔎 Optional refactor to reduce instance creation
 var PostgresFixture = lab.NewFixture(func(harness *lab.Harness) (*database.BloodhoundDB, error) {
 	testCtx := context.Background()
 	if labConfig, ok := lab.Unpack(harness, ConfigFixture); !ok {
 		return nil, fmt.Errorf("unable to unpack ConfigFixture")
 	} else if pgdb, err := database.OpenDatabase(labConfig.Database.PostgreSQLConnectionString()); err != nil {
 		return nil, err
-	} else if err := integration.Prepare(testCtx, database.NewBloodhoundDB(pgdb, auth.NewIdentityResolver())); err != nil {
+	} else {
+		bhdb := database.NewBloodhoundDB(pgdb, auth.NewIdentityResolver())
+		if err := integration.Prepare(testCtx, bhdb); err != nil {
-		return nil, fmt.Errorf("failed ensuring database: %v", err)
+			return nil, fmt.Errorf("failed ensuring database: %v", err)
-	} else if err := bootstrap.MigrateDB(testCtx, labConfig, database.NewBloodhoundDB(pgdb, auth.NewIdentityResolver()), config.NewDefaultAdminConfiguration); err != nil {
+		} else if err := bootstrap.MigrateDB(testCtx, labConfig, bhdb, config.NewDefaultAdminConfiguration); err != nil {
-		return nil, fmt.Errorf("failed migrating database: %v", err)
+			return nil, fmt.Errorf("failed migrating database: %v", err)
-	} else if err := bootstrap.PopulateExtensionData(testCtx, database.NewBloodhoundDB(pgdb, auth.NewIdentityResolver())); err != nil {
+		} else if err := bootstrap.PopulateExtensionData(testCtx, bhdb); err != nil {
-		return nil, fmt.Errorf("failed populating extension data: %v", err)
+			return nil, fmt.Errorf("failed populating extension data: %v", err)
-	} else {
+		}
-		return database.NewBloodhoundDB(pgdb, auth.NewIdentityResolver()), nil
+		return bhdb, nil
-	}
+	}
 }, nil)
packages/go/schemagen/generator/sql.go (1)

212-229: Consider adding error context for debugging.

The filesystem operations (stat, mkdir, open, write) return errors without additional context. While the current error handling is functionally correct, wrapping errors with context would aid debugging when generation fails.

🔎 Example using error wrapping
 if _, err := os.Stat(dir); err != nil {
     if !os.IsNotExist(err) {
-        return err
+        return fmt.Errorf("failed to stat directory %s: %w", dir, err)
     }

     if err := os.MkdirAll(dir, defaultPackageDirPermission); err != nil {
-        return err
+        return fmt.Errorf("failed to create directory %s: %w", dir, err)
     }
 }

 if fout, err := os.OpenFile(path.Join(dir, "ad.sql"), fileOpenMode, defaultSourceFilePermission); err != nil {
-    return err
+    return fmt.Errorf("failed to open ad.sql: %w", err)
 } else {
     defer fout.Close()

     _, err := fout.WriteString(sb.String())
-    return err
+    if err != nil {
+        return fmt.Errorf("failed to write ad.sql: %w", err)
+    }
+    return nil
 }
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e2e3a8 and 7c6b164.

📒 Files selected for processing (18)
  • cmd/api/src/bootstrap/server.go
  • cmd/api/src/daemons/changelog/ingestion_integration_test.go
  • cmd/api/src/daemons/datapipe/datapipe_integration_test.go
  • cmd/api/src/database/database_integration_test.go
  • cmd/api/src/database/db.go
  • cmd/api/src/database/migration/extensions/ad.sql
  • cmd/api/src/database/migration/extensions/az.sql
  • cmd/api/src/database/migration/migration.go
  • cmd/api/src/database/migration/stepwise.go
  • cmd/api/src/database/mocks/db.go
  • cmd/api/src/services/entrypoint.go
  • cmd/api/src/services/graphify/graphify_integration_test.go
  • cmd/api/src/test/integration/database.go
  • cmd/api/src/test/lab/fixtures/postgres.go
  • packages/go/graphify/graph/graph.go
  • packages/go/schemagen/generator/cue.go
  • packages/go/schemagen/generator/sql.go
  • packages/go/schemagen/main.go
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-06-06T23:12:14.181Z
Learnt from: elikmiller
Repo: SpecterOps/BloodHound PR: 1563
File: packages/go/graphschema/azure/azure.go:24-24
Timestamp: 2025-06-06T23:12:14.181Z
Learning: In BloodHound, files in packages/go/graphschema/*/`*.go` are generated from CUE schemas. When `just prepare-for-codereview` is run, it triggers code generation that may automatically add import aliases or other formatting changes. These changes are legitimate outputs of the generation process, not manual edits that would be overwritten.

Applied to files:

  • packages/go/schemagen/main.go
  • packages/go/schemagen/generator/cue.go
  • packages/go/schemagen/generator/sql.go
📚 Learning: 2025-06-25T17:52:33.291Z
Learnt from: superlinkx
Repo: SpecterOps/BloodHound PR: 1606
File: cmd/api/src/analysis/azure/post.go:33-35
Timestamp: 2025-06-25T17:52:33.291Z
Learning: In BloodHound Go code, prefer using explicit slog type functions like slog.Any(), slog.String(), slog.Int(), etc. over simple key-value pairs for structured logging. This provides better type safety and makes key-value pairs more visually distinct. For error types, use slog.Any("key", err) or slog.String("key", err.Error()).

Applied to files:

  • packages/go/schemagen/main.go
  • packages/go/schemagen/generator/cue.go
📚 Learning: 2025-11-25T22:11:53.518Z
Learnt from: LawsonWillard
Repo: SpecterOps/BloodHound PR: 2107
File: cmd/api/src/database/graphschema.go:86-100
Timestamp: 2025-11-25T22:11:53.518Z
Learning: In cmd/api/src/database/graphschema.go, the CreateSchemaEdgeKind method intentionally does not use AuditableTransaction or audit logging because it would create too much noise in the audit log, unlike CreateGraphSchemaExtension which does use auditing.

Applied to files:

  • packages/go/schemagen/generator/sql.go
🧬 Code graph analysis (12)
cmd/api/src/bootstrap/server.go (1)
cmd/api/src/database/db.go (1)
  • Database (72-192)
packages/go/graphify/graph/graph.go (1)
cmd/api/src/bootstrap/server.go (1)
  • PopulateExtensionData (75-81)
cmd/api/src/daemons/datapipe/datapipe_integration_test.go (1)
cmd/api/src/bootstrap/server.go (1)
  • PopulateExtensionData (75-81)
cmd/api/src/database/migration/stepwise.go (1)
cmd/api/src/database/migration/migration.go (1)
  • Migrator (47-51)
cmd/api/src/services/entrypoint.go (1)
cmd/api/src/bootstrap/server.go (1)
  • PopulateExtensionData (75-81)
cmd/api/src/test/lab/fixtures/postgres.go (3)
cmd/api/src/bootstrap/server.go (1)
  • PopulateExtensionData (75-81)
cmd/api/src/database/db.go (1)
  • NewBloodhoundDB (225-227)
cmd/api/src/auth/model.go (1)
  • NewIdentityResolver (74-76)
cmd/api/src/database/db.go (2)
cmd/api/src/bootstrap/server.go (1)
  • PopulateExtensionData (75-81)
cmd/api/src/database/migration/migration.go (1)
  • NewMigrator (54-64)
cmd/api/src/services/graphify/graphify_integration_test.go (1)
cmd/api/src/bootstrap/server.go (1)
  • PopulateExtensionData (75-81)
cmd/api/src/test/integration/database.go (1)
cmd/api/src/bootstrap/server.go (1)
  • PopulateExtensionData (75-81)
cmd/api/src/database/mocks/db.go (1)
cmd/api/src/bootstrap/server.go (1)
  • PopulateExtensionData (75-81)
cmd/api/src/database/database_integration_test.go (1)
cmd/api/src/bootstrap/server.go (1)
  • PopulateExtensionData (75-81)
packages/go/schemagen/generator/sql.go (3)
packages/go/schemagen/model/schema.go (2)
  • ActiveDirectory (62-72)
  • Azure (50-60)
packages/go/schemagen/generator/golang.go (1)
  • SchemaSourceName (32-32)
packages/go/schemagen/csgen/models.go (1)
  • Symbol (23-23)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build BloodHound Container Image / Build and Package Container
  • GitHub Check: run-tests
  • GitHub Check: build-ui
  • GitHub Check: run-analysis
🔇 Additional comments (21)
packages/go/schemagen/generator/cue.go (1)

98-101: LGTM! Structured logging adopted.

The change from unstructured logging to structured slog.Debug with explicit type functions improves observability and aligns with the project's logging standards.

Based on learnings, BloodHound prefers explicit slog type functions like slog.String() for better type safety and visual distinction.

cmd/api/src/services/graphify/graphify_integration_test.go (1)

88-89: LGTM: Extension data population properly integrated into test setup.

The extension data population step is correctly positioned after database migration and before graph schema assertion, with appropriate error handling.

cmd/api/src/daemons/datapipe/datapipe_integration_test.go (1)

93-94: LGTM: Consistent test setup pattern.

The extension data population is correctly integrated with proper error handling and sequencing.

cmd/api/src/database/database_integration_test.go (1)

61-62: LGTM: Extension data population added to database integration tests.

The new initialization step is properly integrated with appropriate error handling.

cmd/api/src/test/integration/database.go (1)

139-140: LGTM: Extension data population integrated into Prepare flow.

The new step is properly sequenced and includes appropriate error wrapping. The deprecation notice for this file doesn't affect the correctness of this change.

cmd/api/src/daemons/changelog/ingestion_integration_test.go (1)

119-119: LGTM: Extension data population added to changelog integration test.

The initialization step is correctly positioned with proper error handling.

packages/go/graphify/graph/graph.go (1)

178-179: LGTM: Extension data population integrated into service initialization.

The extension data population is correctly sequenced between database migration and graph migration, with appropriate error handling and propagation.

cmd/api/src/database/mocks/db.go (1)

2413-2425: LGTM! Generated mock aligns with interface changes.

The gomock-generated PopulateExtensionData method correctly implements the new Database interface method. The mock follows the established pattern and will support testing scenarios where extension data population is invoked.

cmd/api/src/services/entrypoint.go (1)

83-84: LGTM! Extension data population correctly sequenced.

The new PopulateExtensionData step is properly placed after RDBMS migrations and before graph migrations, with appropriate error handling and descriptive error messages.

packages/go/schemagen/main.go (2)

76-86: LGTM! SQL generation function follows established patterns.

The GenerateSQL function mirrors the structure of GenerateGolang, GenerateSharedTypeScript, and GenerateCSharp, providing a consistent interface for generating extension SQL files.


92-93: Good use of structured logging.

The migration to slog with attr.Error and slog.String improves observability and follows Go structured logging best practices.

Based on learnings, this aligns with the preferred pattern in BloodHound for structured logging.

Also applies to: 98-99, 105-105

cmd/api/src/database/migration/stepwise.go (1)

199-239: LGTM! Extension data population is well-structured.

The ExecuteExtensionDataPopulation method properly:

  • Iterates through extension data sources
  • Filters for SQL files
  • Executes each file in a transaction
  • Provides clear error messages with file context

The SQL files are idempotent (DELETE before INSERT), so re-execution is safe if this method is called multiple times.

cmd/api/src/database/migration/migration.go (1)

30-31: LGTM! Clean separation of extension data from migrations.

The new ExtensionMigrations embed and ExtensionsData field provide a clear separation between schema migrations and extension data population, improving maintainability.

Also applies to: 49-49, 59-61

cmd/api/src/database/db.go (2)

101-101: LGTM! Interface extension is focused and well-defined.

The PopulateExtensionData method addition to the Database interface provides a clear contract for extension data initialization.


278-285: LGTM! Implementation properly delegates and logs errors.

The PopulateExtensionData implementation:

  • Delegates to the migrator's ExecuteExtensionDataPopulation
  • Uses structured logging with attr.Error for clear diagnostics
  • Provides descriptive error messages for the extensions data population phase
cmd/api/src/database/migration/extensions/ad.sql (2)

24-130: LGTM! Generated SQL follows correct pattern.

The DO block properly:

  • Captures the new extension id with RETURNING ... INTO
  • Uses the captured id for all subsequent inserts
  • Provides idempotency through initial DELETE

18-18: Foreign key constraints are properly configured with CASCADE deletes.

The schema_node_kinds, schema_edge_kinds, schema_properties, schema_environments, and schema_relationship_findings tables all reference schema_extensions(id) with ON DELETE CASCADE, so the DELETE operation will safely cascade without constraint violations.
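Putting the two observations together, the generated file follows roughly this shape. This is a hand-written sketch with one illustrative node kind, not the actual generated ad.sql:

```sql
-- Idempotent: the DELETE cascades to schema_node_kinds, schema_edge_kinds,
-- etc. via ON DELETE CASCADE before the rows are re-inserted.
DELETE FROM schema_extensions WHERE name = 'AD';

DO $$
DECLARE
    new_extension_id INT;
BEGIN
    INSERT INTO schema_extensions (name, display_name, version, is_builtin)
    VALUES ('AD', 'Active Directory', 'v0.0.1', true)
    RETURNING id INTO new_extension_id;

    -- All child rows reference the freshly captured extension id.
    INSERT INTO schema_node_kinds (schema_extension_id, name, display_name, description, is_display_kind, icon, icon_color)
    VALUES (new_extension_id, 'User', 'User', '', true, 'user', '#17E625');
END $$;
```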

cmd/api/src/database/migration/extensions/az.sql (2)

18-18: Verify foreign key constraints allow deletion.

Same as ad.sql: ensure ON DELETE CASCADE is configured for foreign keys referencing schema_extensions to prevent deletion failures when re-running this script.


24-97: LGTM! Generated SQL follows correct pattern.

The Azure extension SQL properly captures the new extension id and uses it for all node and edge kind inserts, maintaining referential integrity.

packages/go/schemagen/generator/sql.go (2)

27-163: LGTM!

The NodeIcon struct and NodeIcons map provide a clean way to associate UI metadata with schema node types. The hardcoded icon and color mappings are appropriate for built-in AD and Azure node types.


180-180: This is static SQL file generation, not a security vulnerability.

The code generates SQL INSERT statements and writes them to a .sql file. Values come from CUE schemas and a hardcoded NodeIcons map—both developer-controlled sources. Since the SQL is written to a static file rather than executed with user input, there is no SQL injection vector here.

If you want to add defensive escaping for robustness (in case schema definitions ever contain special characters), that's reasonable as a code quality improvement, but this should not be treated as a security issue.

Likely an incorrect or invalid review comment.

Comment on lines +178 to +188
for i, kind := range adSchema.NodeKinds {
    if iconInfo, found := NodeIcons[kind.Symbol]; found {
        sb.WriteString(fmt.Sprintf("\t\t(new_extension_id, '%s', '%s', '', %t, '%s', '%s')", kind.GetRepresentation(), kind.GetName(), found, iconInfo.Icon, iconInfo.Color))
    } else {
        sb.WriteString(fmt.Sprintf("\t\t(new_extension_id, '%s', '%s', '', %t, '', '')", kind.GetRepresentation(), kind.GetName(), found))
    }

    if i != len(adSchema.NodeKinds)-1 {
        sb.WriteString(",\n")
    }
}

⚠️ Potential issue | 🟠 Major

Semantic mismatch: found controls is_display_kind.

The boolean found (indicating whether a NodeIcon entry exists) is used directly as the is_display_kind value in the SQL. This creates a semantic coupling where node types without icons are marked as non-displayable. If is_display_kind should reflect UI display policy rather than icon availability, this is incorrect. If a new node type is added to the schema without a corresponding icon, it would incorrectly be marked is_display_kind=false.

Verify the intended semantics of is_display_kind in the database schema:

#!/bin/bash
# Search for is_display_kind usage and schema definitions
rg -n "is_display_kind" --type=go --type=sql -C3

for i, kind := range adSchema.RelationshipKinds {
    _, traversable := traversableMap[kind.Symbol]

    sb.WriteString(fmt.Sprintf("\t\t(new_extension_id, '%s', '', %t)", kind.GetRepresentation(), traversable))

⚠️ Potential issue | 🟠 Major

SQL injection risk from unescaped string interpolation.

Line 203 has the same SQL injection risk as identified in the node kinds section (lines 180, 182). kind.GetRepresentation() is interpolated without escaping.

🤖 Prompt for AI Agents
In packages/go/schemagen/generator/sql.go around line 203, the code interpolates
kind.GetRepresentation() directly into an SQL string causing SQL injection risk;
escape single quotes in the representation before injection (e.g., replace '
with ''), or better yet build these inserts using parameterized
statements/driver-specific escaping; update the code to sanitize/escape
kind.GetRepresentation() (or switch to parameters) before calling fmt.Sprintf so
generated SQL cannot be broken by embedded quotes.

Comment on lines +232 to +297
func GenerateExtensionSQLAzure(dir string, azSchema model.Azure) error {
    var sb strings.Builder

    sb.WriteString(fmt.Sprintf("-- Code generated by Cuelang code gen. DO NOT EDIT!\n-- Cuelang source: %s/\n", SchemaSourceName))

    sb.WriteString("DELETE FROM schema_extensions WHERE name = 'AZ';\n\n")

    sb.WriteString("DO $$\nDECLARE\n\tnew_extension_id INT;\nBEGIN\n")

    sb.WriteString("\tINSERT INTO schema_extensions (name, display_name, version, is_builtin) VALUES ('AZ', 'Azure', 'v0.0.1', true) RETURNING id INTO new_extension_id;\n\n")

    sb.WriteString("\tINSERT INTO schema_node_kinds (schema_extension_id, name, display_name, description, is_display_kind, icon, icon_color) VALUES\n")

    for i, kind := range azSchema.NodeKinds {
        if iconInfo, found := NodeIcons[kind.Symbol]; found {
            sb.WriteString(fmt.Sprintf("\t\t(new_extension_id, '%s', '%s', '', %t, '%s', '%s')", kind.GetRepresentation(), kind.GetName(), found, iconInfo.Icon, iconInfo.Color))
        } else {
            sb.WriteString(fmt.Sprintf("\t\t(new_extension_id, '%s', '%s', '', %t, '', '')", kind.GetRepresentation(), kind.GetName(), found))
        }

        if i != len(azSchema.NodeKinds)-1 {
            sb.WriteString(",\n")
        }
    }

    sb.WriteString(";\n\n")

    sb.WriteString("\tINSERT INTO schema_edge_kinds (schema_extension_id, name, description, is_traversable) VALUES\n")

    traversableMap := make(map[string]struct{})

    for _, kind := range azSchema.PathfindingRelationships {
        traversableMap[kind.Symbol] = struct{}{}
    }

    for i, kind := range azSchema.RelationshipKinds {
        _, traversable := traversableMap[kind.Symbol]

        sb.WriteString(fmt.Sprintf("\t\t(new_extension_id, '%s', '', %t)", kind.GetRepresentation(), traversable))

        if i != len(azSchema.RelationshipKinds)-1 {
            sb.WriteString(",\n")
        }
    }

    sb.WriteString(";\nEND $$;")

    if _, err := os.Stat(dir); err != nil {
        if !os.IsNotExist(err) {
            return err
        }

        if err := os.MkdirAll(dir, defaultPackageDirPermission); err != nil {
            return err
        }
    }

    if fout, err := os.OpenFile(path.Join(dir, "az.sql"), fileOpenMode, defaultSourceFilePermission); err != nil {
        return err
    } else {
        defer fout.Close()

        _, err := fout.WriteString(sb.String())
        return err
    }
}

🛠️ Refactor suggestion | 🟠 Major

Extract common SQL generation logic to eliminate duplication.

GenerateExtensionSQLAzure is nearly identical to GenerateExtensionSQLActiveDirectory. The functions differ only in extension metadata (name, display name, file name) and schema type. This duplication violates DRY and creates maintenance risk—future changes must be applied consistently to both functions.

Additionally, this function inherits the same issues identified in the AD function:

  • Lines 247, 249: found controls is_display_kind (semantic mismatch)
  • Lines 247, 249, 270: SQL injection risk from unescaped interpolation
🔎 Proposed refactor to eliminate duplication

Create a generic helper that accepts schema interface and extension metadata:

type ExtensionMetadata struct {
    Name        string
    DisplayName string
    Version     string
    FileName    string
}

type SchemaProvider interface {
    GetNodeKinds() []StringEnum
    GetRelationshipKinds() []StringEnum
    GetPathfindingRelationships() []StringEnum
}

func generateExtensionSQL(dir string, metadata ExtensionMetadata, schema SchemaProvider) error {
    var sb strings.Builder
    
    sb.WriteString(fmt.Sprintf("-- Code generated by Cuelang code gen. DO NOT EDIT!\n-- Cuelang source: %s/\n", SchemaSourceName))
    sb.WriteString(fmt.Sprintf("DELETE FROM schema_extensions WHERE name = '%s';\n\n", metadata.Name))
    
    // ... rest of common logic using schema.GetNodeKinds(), etc.
    
    if fout, err := os.OpenFile(path.Join(dir, metadata.FileName), fileOpenMode, defaultSourceFilePermission); err != nil {
        return err
    } else {
        defer fout.Close()
        _, err := fout.WriteString(sb.String())
        return err
    }
}

func GenerateExtensionSQLActiveDirectory(dir string, adSchema model.ActiveDirectory) error {
    return generateExtensionSQL(dir, ExtensionMetadata{
        Name:        "AD",
        DisplayName: "Active Directory",
        Version:     "v0.0.1",
        FileName:    "ad.sql",
    }, adSchemaAdapter{adSchema})
}

func GenerateExtensionSQLAzure(dir string, azSchema model.Azure) error {
    return generateExtensionSQL(dir, ExtensionMetadata{
        Name:        "AZ",
        DisplayName: "Azure",
        Version:     "v0.0.1",
        FileName:    "az.sql",
    }, azSchemaAdapter{azSchema})
}

Then implement adapter types to satisfy the SchemaProvider interface.

@@ -0,0 +1,131 @@
-- Copyright 2025 Specter Ops, Inc.

Consider renaming the ad.sql and az.sql files to something more descriptive, like ad_graph_schema.sql or ad_kinds.sql.

"github.com/specterops/bloodhound/packages/go/schemagen/model"
)

type NodeIcon struct {

NodeIcon and NodeIcons likely don't need exporting.

},
}

func GenerateExtensionSQLActiveDirectory(dir string, adSchema model.ActiveDirectory) error {

Agree with CodeRabbit here: there is an opportunity to DRY up this function and the corresponding Azure generator.

},
}

func GenerateExtensionSQLActiveDirectory(dir string, adSchema model.ActiveDirectory) error {
@brandonshearin brandonshearin commented Dec 23, 2025

Some simple unit testing may be helpful for maintaining this new stuff too. You can use TempDir from testing.T to do something like:

func TestGenerateExtensionSQLActiveDirectory(t *testing.T) {
    // Setup test data
    adSchema := model.ActiveDirectory{
        NodeKinds: []model.NodeKind{
            {Symbol: "User", Name: "User"},
            {Symbol: "Computer", Name: "Computer"},
            {Symbol: "Group", Name: "Group"},
        },
        RelationshipKinds: []model.RelationshipKind{
            {Symbol: "MemberOf"},
            {Symbol: "AdminTo"},
        },
        PathfindingRelationships: []model.PathfindingRelationship{
            {Symbol: "MemberOf"},
        },
    }

    // Create temp directory
    tmpDir := t.TempDir()

    // Execute
    err := GenerateExtensionSQLActiveDirectory(tmpDir, adSchema)

    // Assert
    require.NoError(t, err)

    // Verify file was created
    sqlPath := filepath.Join(tmpDir, "ad.sql")
    require.FileExists(t, sqlPath)

    // Read and verify content
    content, err := os.ReadFile(sqlPath)
    require.NoError(t, err)

    // Assertions on content...
}


And content assertions could be for anything: node icon/color mapping, traversability flag, or just high-level structure. You can read the SQL file into a variable for assertions like:

content, _ := os.ReadFile(filepath.Join(tmpDir, "ad.sql"))
sql := string(content)

// Verify SQL structure
assert.Contains(t, sql, "DELETE FROM schema_extensions WHERE name = 'AD'")
assert.Contains(t, sql, "INSERT INTO schema_extensions")
assert.Contains(t, sql, "INSERT INTO schema_node_kinds")
assert.Contains(t, sql, "INSERT INTO schema_edge_kinds")
assert.Contains(t, sql, "blah blah ")

@LawsonWillard

Just a heads up: with BED-7067 incoming, we'll need to insert the node and edge kinds into the DAWGS kinds table before inserting them into their respective schema tables.
