201 changes: 201 additions & 0 deletions rfc/bh-rfc-4.md
@@ -0,0 +1,201 @@
---
bh-rfc: 4
title: OpenGraph Extension Fundamental Requirements
authors: |
[Alyx Holms]([email protected])
status: DRAFT
created: 2025-12-15
audiences: |
- BloodHound OpenGraph Extension Authors
- BHE Team
---

# OpenGraph Extension Fundamental Requirements

## 1. Overview

This RFC introduces **OpenGraph Extensions**, a framework for modularly extending BloodHound’s data models. It mandates that all extension-defined attributes (e.g. node kinds, edge kinds, properties) use a unique namespace prefix to prevent conflicts and ensure traceability.

## 2. Motivation & Goals

### 2.1 Extensibility
OpenGraph Extensions enable BloodHound to support modular, community-driven data model extensions. Without a standardized approach to namespacing, multiple extensions could define conflicting attribute names, leading to data integrity issues and ambiguous ownership of schema elements.

### 2.2 Hybrid Paths
Extensions must be able to define hybrid paths despite the namespacing requirement. They can do so by referencing other extensions' kinds when defining a hybrid path.

### 2.3 Intentional Interactions vs Unintentional Collisions
BloodHound needs to support intentional interactions (such as hybrid paths) implicitly through references, and must avoid unintentional collisions through aggressive namespacing for any extension-defined identifiers (node kinds, edge kinds, properties, etc.).

⚠️ Potential issue | 🟡 Minor

Fix grammar: compound adjective and abbreviation period.

Line 28 should read extension-defined (with hyphen) and end properties, etc.) (with period before closing paren).

-BloodHound needs to support intentional interactions (such as hybrid paths) implicitly through references, and must avoid unintentional collisions through aggressive namespacing for any extension defined identifiers (node kinds, edge kinds, properties, etc)
+BloodHound needs to support intentional interactions (such as hybrid paths) implicitly through references, and must avoid unintentional collisions through aggressive namespacing for any extension-defined identifiers (node kinds, edge kinds, properties, etc.)
🧰 Tools
🪛 LanguageTool

[grammar] ~28-~28: Use a hyphen to join words.
Context: ...aggressive namespacing for any extension defined identifiers (node kinds, edge ki...

(QB_NEW_EN_HYPHEN)


[style] ~28-~28: In American English, abbreviations like “etc.” require a period.
Context: ...rs (node kinds, edge kinds, properties, etc) - Avoid Attribute Collisions - Pr...

(ETC_PERIOD)

🤖 Prompt for AI Agents
In rfc/bh-rfc-4.md around line 28, fix the grammar by changing "extension
defined identifiers" to the hyphenated compound adjective "extension-defined
identifiers" and ensure the parenthetical ends with a period before the closing
parenthesis so the phrase reads "properties, etc.)." Replace the text on that
line accordingly.


- **Avoid Attribute Collisions** - Prevent multiple extensions from defining the same attribute.
- **Clarity of Ownership** - Trace attributes back to their defining extension via prefixes.

## 3. Considerations

### 3.1 Impact on Existing Systems

SharpHound and AzureHound will initially bypass namespace validation to avoid breaking changes. A future migration tool will:

- Add namespaces retroactively (e.g. `AD_` for SharpHound, `EAD_` for Azure/EntraHound).
- Stop bundling these extensions while still allowing easy installation.

### 3.2 Security & Compliance

- **No Direct Risks** - Namespacing is a structural constraint, not a security feature.
- **Data Integrity** - Prefixes keep attribute names unique, so each schema element has exactly one source of truth.

### 3.3 Drawbacks & Alternatives

#### 3.3.1 Increased Verbosity

- **Drawback** - Increased verbosity in attribute names (e.g. `GH_User` vs. `User`).
- **Alternative** - Global registry for attribute names (rejected due to centralization).

#### 3.3.2 Multiple Extensions Covering the Same Technology

- **Drawback** - Multiple extensions cannot simply cover the same technology using the same types (e.g. GitHub).
- **Mitigation**:
  - Extensions should be focused on their own domain.
  - Namespacing prevents conflicts. For hybrid paths between technologies, references will be allowed to types in other namespaces.
Comment on lines +58 to +59

⚠️ Potential issue | 🟡 Minor

Adjust list indentation to match Markdown convention.

Lines 58–59 use 4 spaces but should use 2 spaces for nested bullets under the mitigation list.

 - **Mitigation**:
-    - Extensions should be focused on their own domain.
-    - Namespacing prevents conflicts. For hybrid paths between technologies, references will be allowed to types in other namespaces.
+  - Extensions should be focused on their own domain.
+  - Namespacing prevents conflicts. For hybrid paths between technologies, references will be allowed to types in other namespaces.
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

58-58: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


59-59: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

🤖 Prompt for AI Agents
In rfc/bh-rfc-4.md around lines 58 to 59, the nested list items under the
mitigation list use 4-space indentation instead of the 2-space indentation
preferred for Markdown lists; update those two lines to use 2 spaces for the
nested bullets so they render consistently as sub-items under the parent bullet.


## 4. Namespace Declaration

Extensions must declare a namespace in their manifest:

```json
{
  "schema": {
    "name": "github_hound",
    "version": "1.0.0",
    "namespace": "GH"
  },
  "node_kinds": [
    {
      "symbol": "GH_User",
      "representation": "Github User",
      "icon": "user",
      "color": "#00FF00"
    }
  ],
  "environments": [
    {
      "environmentKind": "GH_Organization",
      "sourceKind": "GHBase",
      "principalKinds": ["GH_User"]
    }
  ]
}
```

⚠️ Potential issue | 🟠 Major

Clarify whether sourceKind is restricted to core kinds or can be extension-defined.

The RFC has unresolved ambiguity around the treatment of sourceKind:

- Section 5 (line 94) lists GHBase as exempt from prefixing, citing it as "a source kind" not defined by the extension. However, this doesn't explicitly state whether sourceKind itself is restricted to core kinds.
- Section 10 (line 200) states that sourceKind should be created if it doesn't exist, implying extensions can define their own source kinds, yet no guidance is given on whether extension-defined source kinds must follow the naming convention (e.g., GH_Base vs. GHBase).
- The manifest examples (lines 83, 190) show sourceKind as GHBase (unprefixed), while environmentKind and principalKinds are consistently prefixed (GH_Organization, GH_User).

Recommendation: Explicitly document in Section 5 whether sourceKind is exclusively core-owned (and thus exempt from prefixing) or whether extensions may define source kinds (in which case the naming rule should apply). Align the rule with the examples and the validation logic in Section 10.

## 5. Attribute Prefixing Rules

- **Format** - `namespace` + `_` + `<type>` (e.g., `GH_User`)
- **Required** - All extension-defined attributes (e.g., `GH_User`, `GH_MemberOf`).
- **Exempt** - References to attributes not defined by the extension (e.g., `"GHBase"` is a source kind).
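
The format rule above can be illustrated with a minimal Python sketch. The helper names `qualify` and `is_prefixed` are hypothetical and not part of the RFC, and the case-sensitive comparison is an assumption, since the RFC does not say how casing is handled.

```python
def qualify(namespace: str, type_name: str) -> str:
    """Build a prefixed attribute name: namespace + "_" + type."""
    return f"{namespace}_{type_name}"


def is_prefixed(symbol: str, namespace: str) -> bool:
    """Return True if symbol starts with "<namespace>_" followed by a non-empty type.

    Assumption: the comparison is case-sensitive.
    """
    prefix = f"{namespace}_"
    return symbol.startswith(prefix) and len(symbol) > len(prefix)


print(qualify("GH", "User"))             # GH_User
print(is_prefixed("GH_MemberOf", "GH"))  # True
print(is_prefixed("GHBase", "GH"))       # False: no "_" separator after the namespace
```

Under this reading, `GHBase` does not match the `GH_` prefix, which is consistent with it being treated as a reference rather than an extension-defined attribute.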

## 6. Validation

Reject extensions if:

1. Namespace conflicts with an existing extension.
2. Any attribute definition (not reference) lacks the required prefix.

**Example of Invalid Schema**:

```json
{
  "namespace": "GH",
  ...
  "node_kinds": [
    { "symbol": "User" }
  ]
}
```

The above schema is invalid because `User` should be prefixed with the extension's declared namespace (e.g. `GH_User`).
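
The two rejection rules could be sketched roughly as follows. This is an illustration only: `validate_manifest` and `registered_namespaces` are hypothetical names, the namespace is read from `schema.namespace` as in the Section 4 manifest, and only `node_kinds` symbols are checked because the RFC does not show the manifest keys for other definitions.

```python
def validate_manifest(manifest: dict, registered_namespaces: set[str]) -> list[str]:
    """Return a list of rejection reasons; an empty list means the manifest passes."""
    errors: list[str] = []

    # Assumption: the namespace lives under "schema", as in the Section 4 example.
    namespace = manifest.get("schema", {}).get("namespace", "")
    if not namespace:
        return ["manifest does not declare a namespace"]

    # Rule 1: the namespace must not conflict with an existing extension.
    if namespace in registered_namespaces:
        errors.append(f"namespace {namespace!r} conflicts with an existing extension")

    # Rule 2: every definition (not reference) must carry the namespace prefix.
    prefix = f"{namespace}_"
    for kind in manifest.get("node_kinds", []):
        symbol = kind.get("symbol", "")
        if not symbol.startswith(prefix):
            errors.append(f"node kind {symbol!r} lacks the required prefix {prefix!r}")

    return errors


invalid = {
    "schema": {"name": "github_hound", "version": "1.0.0", "namespace": "GH"},
    "node_kinds": [{"symbol": "User"}],
}
# "AD" stands in for a namespace that is already registered (hypothetical).
print(validate_manifest(invalid, registered_namespaces={"AD"}))
```

Collecting every violation instead of stopping at the first one is a design choice in this sketch, not something the RFC prescribes.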

## 7. Handling Customization

Extensions may include default customization for their attributes in the manifest. These will be stored with the extension but will not overwrite existing user-customized definitions. Examples of customization include node icons and colors, which are currently defined in the `custom_node_kinds` table. Extensions can declare icons and colors, but they will only be written to the actual `custom_node_kinds` table if they do not already exist.
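
As a rough sketch of the write-only-if-absent behavior described above, the following uses a plain dictionary as a stand-in for the `custom_node_kinds` table. The function name, the dictionary layout, and the `GH_Repository` kind are assumptions made for illustration; the RFC does not describe the table's actual columns.

```python
def apply_default_customizations(custom_node_kinds: dict, defaults: list[dict]) -> list[str]:
    """Write extension defaults only for kinds that have no customization yet.

    custom_node_kinds maps a kind symbol to its icon/color (hypothetical layout).
    Returns the symbols that were actually written.
    """
    written = []
    for entry in defaults:
        symbol = entry["symbol"]
        if symbol in custom_node_kinds:
            # An existing (possibly user-customized) definition is never overwritten.
            continue
        custom_node_kinds[symbol] = {"icon": entry.get("icon"), "color": entry.get("color")}
        written.append(symbol)
    return written


# A user has already customized GH_User; GH_Repository is a hypothetical second kind.
table = {"GH_User": {"icon": "github", "color": "#000000"}}
extension_defaults = [
    {"symbol": "GH_User", "icon": "user", "color": "#00FF00"},
    {"symbol": "GH_Repository", "icon": "book", "color": "#0000FF"},
]
print(apply_default_customizations(table, extension_defaults))  # ['GH_Repository']
print(table["GH_User"])  # unchanged: {'icon': 'github', 'color': '#000000'}
```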

## 8. Kinds Table Handling

Extensions should use junction tables when creating relationships with tables owned by DAWGS (e.g., the `kinds` table). Modifying these tables directly is discouraged for performance and reliability reasons.

```mermaid
erDiagram
    schema_extensions ||--o{ schema_node_kinds : "extension"
    schema_extensions ||--o{ schema_relationship_kinds : "extension"
    schema_extensions ||--o{ schema_environments : "extension"

    schema_node_kinds ||--|| kinds : "kind_id FK"
    schema_relationship_kinds ||--|| kinds : "kind_id FK"
    schema_environments ||--|| kinds : "environment_kind_id FK"
    schema_environments ||--|| kinds : "source_kind_id FK"
    schema_environments ||--o{ schema_environments_principal_kinds : "environment_id FK"
    schema_environments_principal_kinds ||--|| kinds : "principal_kind FK"

    schema_extensions {
        int id
        text name
        text display_name
        text version
        text namespace
    }

    schema_node_kinds {
        int id
        int extension_id
        int kind_id
        text display_name
        text description
        bool is_display_kind
        text icon
        text color
    }

    schema_relationship_kinds {
        int id
        int extension_id
        int kind_id
        text description
        bool is_traversable
    }

    schema_environments {
        int id
        int extension_id
        int environment_kind_id
        int source_kind_id
    }

    schema_environments_principal_kinds {
        int id
        int environment_id
        int principal_kind
    }
```

## 9. Environments and Principal Kinds

Environments define the security boundary of a user's model (e.g., Domain in Active Directory, Tenant in Azure). Principal kinds are nodes that count toward exposure/impact scores (e.g., User, Computer).

**Example Environment Schema**:

```json
{
  "environments": [
    {
      "environmentKind": "GH_Organization",
      "sourceKind": "GHBase",
      "principalKinds": ["GH_User"]
    }
  ]
}
```

## 10. Validation Rules for Environments

1. Ensure the specified `environmentKind` exists.
2. Ensure the specified `sourceKind` exists (create if it doesn't, reactivate if it does).
3. Ensure all `principalKinds` exist.
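
One possible reading of these three rules is sketched below. `KindRegistry` and `validate_environment` are hypothetical names, and the sketch makes two assumptions the RFC does not state: source kinds carry an active/inactive flag (implied by "reactivate"), and the source kind is only created or reactivated once the other checks have passed.

```python
from dataclasses import dataclass, field


@dataclass
class KindRegistry:
    """Hypothetical in-memory stand-in for kind and source-kind storage."""
    kinds: set[str] = field(default_factory=set)
    source_kinds: dict[str, bool] = field(default_factory=dict)  # name -> active flag


def validate_environment(env: dict, registry: KindRegistry) -> list[str]:
    """Apply rules 1-3 to one environment entry and return any validation errors."""
    errors: list[str] = []

    # Rule 1: the environment kind must already exist.
    env_kind = env.get("environmentKind", "")
    if env_kind not in registry.kinds:
        errors.append(f"environmentKind {env_kind!r} does not exist")

    source_kind = env.get("sourceKind", "")
    if not source_kind:
        errors.append("sourceKind is missing")

    # Rule 3: every principal kind must already exist.
    for kind in env.get("principalKinds", []):
        if kind not in registry.kinds:
            errors.append(f"principalKind {kind!r} does not exist")

    # Rule 2: create the source kind if absent, reactivate it if present.
    # Both cases end with an active entry. Assumption: only done when no errors.
    if not errors:
        registry.source_kinds[source_kind] = True
    return errors


registry = KindRegistry(kinds={"GH_Organization", "GH_User"}, source_kinds={"GHBase": False})
environment = {
    "environmentKind": "GH_Organization",
    "sourceKind": "GHBase",
    "principalKinds": ["GH_User"],
}
print(validate_environment(environment, registry))  # []
print(registry.source_kinds)                        # {'GHBase': True}
```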
Comment on lines +199 to +201

⚠️ Potential issue | 🟡 Minor

Reduce repetitive sentence structure in validation rules.

Lines 199–201 have three consecutive sentences starting with "Ensure". Vary the structure for clarity.

 1. Ensure the specified `environmentKind` exists.
-2. Ensure the specified `sourceKind` exists (create if it doesn't, reactivate if it does).
-3. Ensure all `principalKinds` exist.
+2. The specified `sourceKind` must exist (or be created and reactivated as needed).
+3. All `principalKinds` must be defined.
🧰 Tools
🪛 LanguageTool

[style] ~201-~201: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... it doesn't, reactivate if it does). 3. Ensure all principalKinds exist.

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🤖 Prompt for AI Agents
rfc/bh-rfc-4.md around lines 199 to 201: the three validation rules each begin
with the word "Ensure", producing repetitive sentence structure; rewrite these
three lines to vary sentence starters and improve clarity while preserving
meaning (for example: "Verify the specified `environmentKind` exists.", "If the
specified `sourceKind` does not exist, create it; if it exists but is inactive,
reactivate it.", "Confirm that all `principalKinds` are present."), keeping the
same order and intent.