databricks

Users and Groups Best Practices

This guide provides best practices for managing users and groups in Databricks on Google Cloud, ensuring security, scalability, and efficient administration. Based on the official Databricks documentation: Users and Groups Best Practices.

Identity Management Architecture

graph TB
    subgraph "Identity Provider"
        IDP[Google Cloud Identity<br/>or External IdP]
        SCIM[SCIM Provisioning<br/>Automated Sync]
        SSO[Single Sign-On<br/>SAML/OIDC]
    end

    subgraph "Databricks Account"
        ACCT[Account Console]
        USERS[Users]
        GROUPS[Groups]
        SP[Service Principals<br/>For Automation]
    end

    subgraph "Access Control"
        WS_ACCESS[Workspace Access]
        CLUSTER_ACCESS[Cluster Permissions]
        DATA_ACCESS[Data Permissions<br/>Unity Catalog]
        JOB_ACCESS[Job Permissions]
    end

    subgraph "Audit & Compliance"
        AUDIT[Audit Logs]
        REVIEW[Access Reviews]
        REVOKE[Auto-Revocation<br/>Inactive Users]
    end

    IDP --> SCIM
    SCIM --> USERS
    SCIM --> GROUPS
    SSO --> ACCT

    USERS --> GROUPS
    GROUPS --> WS_ACCESS
    GROUPS --> CLUSTER_ACCESS
    GROUPS --> DATA_ACCESS
    GROUPS --> JOB_ACCESS

    SP --> JOB_ACCESS

    WS_ACCESS --> AUDIT
    DATA_ACCESS --> AUDIT
    AUDIT --> REVIEW
    REVIEW --> REVOKE

    style IDP fill:#4285F4
    style SCIM fill:#43A047
    style GROUPS fill:#1E88E5
    style SP fill:#FF6F00
    style AUDIT fill:#8E24AA

Table of Contents

  1. Use Identity Federation
  2. Follow Least Privilege Access
  3. Use Groups for Role-Based Access Control
  4. Manage Service Principals for Automation
  5. Audit and Monitor Access
  6. Review and Revoke Unused Access

Use Identity Federation

Follow Least Privilege Access

Use Groups for Role-Based Access Control

Role-Based Access Control (RBAC) Model

graph TB
    subgraph "User Groups"
        DATA_ENG[Data Engineers Group]
        DATA_SCI[Data Scientists Group]
        DATA_ANAL[Data Analysts Group]
        ADMIN[Admins Group]
    end

    subgraph "Workspace Permissions"
        WS_ADMIN[Workspace Admin<br/>Full Control]
        WS_USER[Workspace User<br/>Standard Access]
        WS_VIEW[Workspace Viewer<br/>Read-only]
    end

    subgraph "Cluster Permissions"
        CLUSTER_CREATE[Can Create Clusters]
        CLUSTER_RESTART[Can Restart Clusters]
        CLUSTER_ATTACH[Can Attach to Clusters]
    end

    subgraph "Data Permissions"
        UC_ADMIN[Unity Catalog Admin<br/>Manage Catalogs]
        DATA_OWNER[Data Owner<br/>Full Table Access]
        DATA_READ[Data Reader<br/>SELECT Only]
    end

    ADMIN --> WS_ADMIN
    ADMIN --> CLUSTER_CREATE
    ADMIN --> UC_ADMIN

    DATA_ENG --> WS_USER
    DATA_ENG --> CLUSTER_CREATE
    DATA_ENG --> DATA_OWNER

    DATA_SCI --> WS_USER
    DATA_SCI --> CLUSTER_ATTACH
    DATA_SCI --> DATA_READ

    DATA_ANAL --> WS_VIEW
    DATA_ANAL --> CLUSTER_ATTACH
    DATA_ANAL --> DATA_READ

    style ADMIN fill:#E53935
    style DATA_ENG fill:#1E88E5
    style DATA_SCI fill:#43A047
    style DATA_ANAL fill:#FF6F00
    style WS_ADMIN fill:#FDD835

Manage Service Principals for Automation

Service Principal Usage Flow

sequenceDiagram
    participant Admin
    participant ACCT as Account Console
    participant SP as Service Principal
    participant JOB as Databricks Job
    participant API as Databricks API
    participant DATA as Data Sources

    Admin->>ACCT: Create Service Principal
    ACCT-->>SP: SP Created with Client ID

    Admin->>SP: Generate Access Token
    SP-->>Admin: Token (expires in N days)

    Admin->>SP: Assign Permissions<br/>(Job runner, Data access)

    Note over SP,JOB: Automated Job Execution

    JOB->>SP: Authenticate with Token
    SP->>ACCT: Validate Token
    ACCT-->>SP: Token Valid
    SP->>JOB: Authorization Granted
    JOB->>DATA: Access Data<br/>(via SP permissions)
    DATA-->>JOB: Data Returned

    Note over API: API Access

    API->>SP: API Call with Token
    SP->>ACCT: Validate & Authorize
    ACCT-->>API: Request Completed

    Note over Admin,SP: Best Practices:<br/>- Rotate tokens regularly<br/>- Use least privilege<br/>- One SP per application

Audit and Monitor Access

Review and Revoke Unused Access

Access Lifecycle Management

stateDiagram-v2
    [*] --> Onboarding: New User/Employee
    Onboarding --> Provisioned: SCIM Auto-provision
    Provisioned --> GroupAssignment: Assign to Groups
    GroupAssignment --> Active: Access Granted

    Active --> AccessReview: Quarterly Review
    AccessReview --> Active: Access Still Needed
    AccessReview --> Modified: Role Changed
    AccessReview --> Inactive: No Activity Detected

    Modified --> GroupAssignment: Update Permissions

    Inactive --> Warning90Days: 90 Days No Activity
    Warning90Days --> Disabled: Auto-disable Account

    Active --> Offboarding: Employee Leaves
    Offboarding --> Revoked: Remove All Access

    Disabled --> [*]
    Revoked --> [*]

    note right of Active
        Continuous monitoring
        Audit logs tracked
        Anomaly detection
    end note

    note right of Disabled
        Account disabled
        Can be re-enabled
        Data preserved
    end note

    note right of Revoked
        Immediate revocation
        No grace period
        Compliance requirement
    end note

Audit Monitoring Dashboard

graph TB
    subgraph "Audit Data Sources"
        LOGS[Audit Logs<br/>Account & Workspace]
        SYS[System Tables<br/>system.access.*]
        GCS[GCS Audit Logs<br/>Storage Access]
    end

    subgraph "Monitoring Activities"
        LOGIN[Login Attempts<br/>Success/Failure]
        PERM[Permission Changes<br/>Grants/Revokes]
        DATA[Data Access<br/>Table Queries]
        API[API Calls<br/>Automation Activity]
    end

    subgraph "Alerting"
        ANOM[Anomaly Detection<br/>Unusual Patterns]
        FAILED[Failed Access<br/>Multiple Attempts]
        PRIVESC[Privilege Escalation<br/>Admin Changes]
        EXFIL[Data Exfiltration<br/>Large Exports]
    end

    subgraph "Actions"
        NOTIFY[Notify Security Team]
        SUSPEND[Suspend Account]
        REVIEW[Trigger Review]
        TICKET[Create Incident Ticket]
    end

    LOGS --> LOGIN
    LOGS --> PERM
    SYS --> DATA
    SYS --> API
    GCS --> DATA

    LOGIN --> FAILED
    PERM --> PRIVESC
    DATA --> EXFIL
    API --> ANOM

    ANOM --> NOTIFY
    FAILED --> SUSPEND
    PRIVESC --> REVIEW
    EXFIL --> TICKET

    style LOGS fill:#1E88E5
    style ANOM fill:#FF6F00
    style FAILED fill:#E53935
    style PRIVESC fill:#E53935
    style EXFIL fill:#E53935
    style NOTIFY fill:#FDD835

For more details, refer to the official Databricks Users and Groups Best Practices.


Following these best practices ensures secure and efficient user and group management in Databricks on Google Cloud.