Description
This proposal introduces role-based access control (RBAC) to support fine-grained permissions on nodes and namespaces.
Core Concepts
Principals
Principals are the entities that can be granted access in the RBAC system. These are already modeled in the users table with a kind field that distinguishes between different principal types:
- Regular authenticated users represent human actors in the system
- Service accounts represent automated processes like CI/CD bots or sync services that need programmatic access
- Groups are collections of users and service accounts, which can be sourced from Postgres for self-contained deployments, or from external systems like LDAP or Google Groups for enterprise deployments.
Roles
A role in this design is a named collection of permissions, represented as scopes.
For example, a finance-data-eng role might grant read and write access to the finance.* namespace while only granting read access to growth.*. A finance-owners role could grant full control (read, write, and manage) over the finance.* namespace. Service accounts like ci-bot-staging might have read, write, and execute permissions on staging.* for automated deployments. A global-viewer role could provide read-only access to all namespaces using a wildcard pattern.
Actions
The system defines four core action types that represent what can be done with a resource. The read action allows viewing node metadata, SQL definitions, and dependencies. The write action permits creating, editing, and updating nodes. The execute action grants the ability to run queries against nodes. Finally, the manage action provides owner-level capabilities, including the ability to grant or revoke access to others.
Scope
A scope defines the resource boundary for an action by combining a scope type with a scope value. The scope type can be either namespace or node, indicating whether the permission applies to a collection of related nodes or a specific individual node. The scope value is a pattern that can be a namespace wildcard like finance.*, a specific node identifier like finance.revenue, or a global wildcard * that matches everything.
The wildcard matching follows intuitive cascading rules. For example, finance.* matches not just direct children like finance.revenue and finance.costs, but also deeply nested nodes like finance.team.subteam.revenue. This cascading behavior means that granting access at a namespace level automatically extends to all current and future nodes within that namespace hierarchy.
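The matching rules above can be sketched as a small function. This is a minimal illustration, not the proposal's implementation: the name scope_matches and the plain-string arguments are assumptions made for the example.

```python
def scope_matches(scope_type: str, scope_value: str, node_name: str) -> bool:
    """Return True if a scope covers the given fully qualified node name."""
    if scope_value == "*":
        # Global wildcard: matches every node.
        return True
    if scope_type == "namespace" and scope_value.endswith(".*"):
        # "finance.*" becomes the prefix "finance.", so the match cascades
        # to any depth but does not leak into "finance_ops".
        return node_name.startswith(scope_value[:-1])
    # Node scopes match exactly one node identifier.
    return scope_type == "node" and node_name == scope_value
```

Keeping the trailing dot in the prefix is what stops finance.* from matching a sibling namespace such as finance_ops.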
Role Assignments
A role assignment is the central operation in the RBAC system. It connects a principal to a role and records metadata about when and by whom the assignment was made. For instance, an assignment might grant the data-eng-team group the growth-editors role, thereby giving all members of that team the permissions defined in that role.
Data Model
The RBAC system is built on three interconnected tables that together define the complete access control model.
The roles table defines named collections of permissions. Each role has a unique identifier, a name that serves as a semantic label, and a description that explains its purpose. Roles are simply containers; the actual permissions are defined separately in the role_scopes table.
The role_scopes table defines the individual permissions within each role. Each scope record belongs to a specific role (via role_id) and specifies an action, a scope type, and a scope value. This decomposition allows a single role to grant multiple different permissions across different resources.
The role_assignments table is where principals are granted roles. Each assignment record connects a principal (via principal_id) to a role (via role_id) and includes important metadata: who granted the assignment (granted_by_id), when it was granted (granted_at), and optionally when it expires (expires_at). This expiration mechanism supports temporary access grants, which is useful for contractors or time-limited projects.
For example, when Alice creates the first node in a new namespace (e.g., users.alice), the system automatically creates a namespace-level role (e.g., users.alice-owner) with scopes granting full control over users.alice.*. Subsequent nodes in the same namespace are covered by wildcard matching, so no new role needs to be created.
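As a rough sketch, the three tables could be mirrored by the record types below. The column names role_id, principal_id, granted_by_id, granted_at, and expires_at come from the description above; the class names, the integer ID types, and the is_active helper are assumptions for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Role:
    id: int
    name: str
    description: str = ""


@dataclass(frozen=True)
class RoleScope:
    role_id: int
    action: str       # "read", "write", "execute", or "manage"
    scope_type: str   # "namespace" or "node"
    scope_value: str  # e.g. "finance.*", "finance.revenue", or "*"


@dataclass(frozen=True)
class RoleAssignment:
    principal_id: int
    role_id: int
    granted_by_id: int
    granted_at: datetime
    expires_at: Optional[datetime] = None  # None means the grant never expires

    def is_active(self, now: datetime) -> bool:
        """Hypothetical helper: an assignment is active until it expires."""
        return self.expires_at is None or now < self.expires_at


# Example: a 30-day grant for a contractor (all values are illustrative).
granted = datetime(2025, 1, 1, tzinfo=timezone.utc)
contractor = RoleAssignment(
    principal_id=7,
    role_id=1,
    granted_by_id=2,
    granted_at=granted,
    expires_at=granted + timedelta(days=30),
)
```

Splitting scopes into their own records is what lets one role carry several (action, scope) pairs without changing the assignment record.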
Use Cases
Node Creation and Auto-Ownership
When a user creates a new node, they should automatically become its owner with full control. This happens through an automatic role assignment. When Alice creates the node finance.revenue, the system creates a role specifically for this node (perhaps named finance.revenue-owner) with scopes for read, write, execute, and manage permissions on that exact node. This role is then immediately assigned to Alice. As a result, Alice owns the node and can grant access to others by assigning additional roles to other principals.
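A minimal in-memory sketch of this auto-ownership hook is shown below. The InMemoryRBAC class, the on_node_created method, and recording the creator as the granter are all assumptions for the example; the proposal does not specify who is recorded as the granter for automatic assignments.

```python
from dataclasses import dataclass, field

ACTIONS = ("read", "write", "execute", "manage")


@dataclass
class InMemoryRBAC:
    # role name -> list of (action, scope_type, scope_value)
    roles: dict = field(default_factory=dict)
    # list of (principal, role name, granted_by)
    assignments: list = field(default_factory=list)

    def on_node_created(self, creator: str, node_name: str) -> str:
        """Create a node-level owner role and assign it to the creator."""
        role_name = f"{node_name}-owner"
        # One scope per action, each limited to this exact node.
        self.roles[role_name] = [(action, "node", node_name) for action in ACTIONS]
        # Assumption: the creator is recorded as the granter.
        self.assignments.append((creator, role_name, creator))
        return role_name


rbac = InMemoryRBAC()
owner_role = rbac.on_node_created("alice", "finance.revenue")
```

Because the owner role includes manage, Alice can then grant access to other principals without involving an admin.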
CI/CD Bot as Exclusive Owner
For production safety, organizations often want to ensure that only automated deployment processes can modify staging or production environments, preventing manual edits that might not be tracked or tested. Consider a finance team that wants to manage their nodes using a Git repository, where YAML files define DJ nodes and CI/CD automatically syncs them. To enable this workflow:
- Create a service account called finance-sync-bot through the DJ API.
- Create a role called finance-sync-bot-role with scopes for read and write permissions on the namespace finance.*.
- Assign this role to the service account. The system provides client credentials (client_id and client_secret) that can be stored as GitHub secrets.
- The CI/CD workflow then uses these credentials to authenticate and sync nodes.
Critically, do not grant any other users or groups write access to this namespace. Users can still query and read nodes in the namespace to verify deployments, but any attempt to manually edit a node there will be denied. Only the CI bot, authenticating with its service account credentials, can modify these nodes. This pattern is particularly valuable for production or sensitive namespaces where you want strong change control and a clear audit trail.
Team Access to Namespace
Imagine the data engineering team needs to collaborate on nodes in the growth namespace. An admin or namespace owner creates a role called growth-editors with scopes for read and write permissions on namespace growth.*. They then assign this role to the group data-eng-team. Now, anyone who is a member of the data engineering team (as determined by the group membership service) can edit nodes in the growth namespace. If a new engineer joins the team, they automatically gain access as soon as they're added to the group, without needing separate permission grants in DJ.
Configurable Default Access
Different organizations have different security postures and collaboration cultures. A startup might want everyone to be able to explore all data freely, while a healthcare company might need to lock down access by default due to regulatory requirements. To support these different needs, DJ can make the default access policy configurable through a setting like DEFAULT_ACCESS_ROLE.
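One way to picture the setting is as the name of an ordinary role that is consulted as a fallback. The sketch below is an assumption about how that might look: the ROLES mapping, the default_allows function, and passing the setting as a parameter are hypothetical, and only the DEFAULT_ACCESS_ROLE name and the global-viewer role come from the text.

```python
from typing import Optional

# Hypothetical role definitions: role name -> list of (action, scope_value).
ROLES = {
    "global-viewer": [("read", "*")],
}


def _covers(scope_value: str, node_name: str) -> bool:
    """Match a global wildcard, a namespace wildcard, or an exact node name."""
    if scope_value == "*":
        return True
    if scope_value.endswith(".*"):
        return node_name.startswith(scope_value[:-1])
    return node_name == scope_value


def default_allows(default_access_role: Optional[str], action: str, node_name: str) -> bool:
    """Check the fallback role used when no explicit grant matches."""
    if default_access_role is None:
        # Locked-down posture: no fallback role, so nothing is granted.
        return False
    return any(
        granted_action == action and _covers(scope_value, node_name)
        for granted_action, scope_value in ROLES.get(default_access_role, [])
    )
```

Under these assumptions, an open organization would set DEFAULT_ACCESS_ROLE to global-viewer, while a regulated one would leave it unset so that every access requires an explicit grant.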
Permission Resolution
When a user attempts to perform an action on a resource, the system must determine whether to allow or deny the request.
- Check if the user is marked as an admin in the database. If so, the request is immediately allowed, bypassing all further checks.
- If the user is not an admin, collect all principals associated with them. This always includes the user themselves, but also includes any groups they belong to. Group membership is determined by querying the configured group membership service, which might check a Postgres table for self-contained deployments or call out to separate APIs for enterprise deployments.
- With the list of principals in hand, the system queries the role_assignments table to find all roles assigned to any of those principals. For each assigned role, it loads the associated role scopes. Now we have a complete list of all permissions the user has, whether directly assigned or inherited through group membership.
- Check each scope to see if it grants the requested action on the requested resource. This involves checking if the action matches and if the resource pattern matches. For namespace wildcards, the system implements cascading logic where xyz.* matches any node that starts with xyz., including deeply nested nodes.
- If any scope grants the required permission, the request is allowed. If no explicit grant is found, the system makes one final check: is there a default access role configured? If so, it checks whether that default role's scopes would grant the permission. This is where the configurable default access policy comes into play.
- If neither explicit grants nor the default role provide the necessary permission, the request is denied.
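The steps above can be sketched end to end with in-memory data. Everything named here (Scope, scope_grants, is_allowed, and the dictionaries standing in for the tables and the group membership service) is a hypothetical stand-in, not the proposal's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    action: str
    scope_type: str
    scope_value: str


def scope_grants(scope: Scope, action: str, node_name: str) -> bool:
    """True if the scope's action matches and its pattern covers the node."""
    if scope.action != action:
        return False
    if scope.scope_value == "*":
        return True
    if scope.scope_type == "namespace" and scope.scope_value.endswith(".*"):
        return node_name.startswith(scope.scope_value[:-1])
    return scope.scope_type == "node" and node_name == scope.scope_value


def is_allowed(user, action, node_name, *, admins, groups, assignments,
               role_scopes, default_role: Optional[str] = None) -> bool:
    # Admins bypass all further checks.
    if user in admins:
        return True
    # Collect the user plus every group they belong to.
    principals = {user, *groups.get(user, [])}
    # Find all roles assigned to any of those principals.
    roles = {role for principal, role in assignments if principal in principals}
    # Allow if any scope of any assigned role grants the action.
    if any(scope_grants(s, action, node_name)
           for role in roles for s in role_scopes.get(role, [])):
        return True
    # Fall back to the default access role, if one is configured.
    if default_role is not None:
        return any(scope_grants(s, action, node_name)
                   for s in role_scopes.get(default_role, []))
    # No explicit grant and no default grant: deny.
    return False


# Illustrative data: bob inherits growth-editors through data-eng-team.
role_scopes = {
    "growth-editors": [Scope("read", "namespace", "growth.*"),
                       Scope("write", "namespace", "growth.*")],
    "global-viewer": [Scope("read", "namespace", "*")],
}
data = dict(admins={"root"}, groups={"bob": ["data-eng-team"]},
            assignments=[("data-eng-team", "growth-editors")],
            role_scopes=role_scopes)

can_edit_growth = is_allowed("bob", "write", "growth.signups.daily", **data)
can_edit_finance = is_allowed("bob", "write", "finance.revenue", **data)
```

In this sketch, can_edit_growth is True because the group's role covers growth.*, and can_edit_finance is False because no scope, explicit or default, grants write on finance.revenue.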