Databricks Workspace with Unity Catalog on GCP

⚠️ IMPORTANT - SPECIAL CONFIGURATION NOTICE

Least Privilege Workspaces (LPW) is a specialized Databricks deployment configuration with enhanced security controls and explicit access allowlisting.

Before using this module:

  1. Contact your Databricks Account Team or Databricks Support
  2. Request approval for Least Privilege Workspace deployment
  3. Review the VPC-SC policies in /templates/vpcsc-policy/least-privilege-workspaces/
  4. Coordinate with Databricks for your specific security requirements

For standard workspace deployments, use one of the other configurations in /templates/terraform-scripts/ (byovpc-ws, byovpc-psc-ws, byovpc-cmek-ws, byovpc-psc-cmek-ws, or end2end).

This module is intended for highly regulated environments (financial services, healthcare, government) where maximum security posture and explicit access allowlisting are mandatory requirements.


A production-ready Terraform module for deploying Databricks workspaces on Google Cloud Platform (GCP) with complete Unity Catalog governance, compute policies, and SQL warehouses.

Overview

This module provides a comprehensive solution for creating enterprise-grade Databricks workspaces on GCP with:

Key Features

Workspace Provisioning

Unity Catalog Setup

Compute Resources

SQL Analytics

Architecture

graph TB
    subgraph "Phase 1: PROVISIONING"
        A[terraform apply -var=phase=PROVISIONING] --> B[Create Network Config]
        B --> C[Create Workspace Shell]
        C --> D[Databricks Creates Workspace GSA]
        D --> E[Manual: Add GSA to Operator Group]
    end

    subgraph "Phase 2: RUNNING"
        E --> F[terraform apply -var=phase=RUNNING]
        F --> G[Attach Network to Workspace]
        G --> H[Poll Workspace Status]
        H --> I{Status = RUNNING?}
        I -->|No| H
        I -->|Yes| J[Assign Metastore]
        J --> K[Create Storage Credentials]
        K --> L[Create GCS Buckets + IAM]
        L --> M[Create External Locations]
        M --> N[Create Catalogs]
        N --> O[Create Instance Pools]
        O --> P[Create Cluster Policies]
        P --> Q[Create SQL Warehouses]
        Q --> R[Grant Permissions]
    end

Prerequisites

⚠️ IMPORTANT: This module does NOT create the foundational infrastructure. The following resources must exist BEFORE deploying:

1. Databricks Account Setup (Must Exist)

Account Access

Regional Unity Catalog Metastore (Must Be Created First)

Databricks Groups (Must Exist Before Deployment)

Regional Databricks Infrastructure IDs (Obtain from Databricks Account Team)

2. GCP Infrastructure (Must Exist)

GCP Projects

Network Infrastructure

Service Account with Permissions

3. Terraform Environment

Required Tools

Provider Versions

4. Information Gathering Checklist

Before starting deployment, gather the following information:

From Databricks Account Team:

From GCP:

From Your Organization:

What This Module Creates

This module creates the following resources (it does NOT create the prerequisites above):

Phase 1:

Phase 2:

What You Must Create Separately

The following are NOT created by this module and must exist beforehand:

❌ VPC network
❌ Subnets
❌ Unity Catalog metastore
❌ Databricks account
❌ Databricks groups
❌ GCP projects
❌ Regional Databricks endpoint infrastructure
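If the network prerequisites do not already exist, they are typically provisioned by a separate platform or networking configuration. A minimal sketch of what that might look like with the google provider (the project, names, region, and CIDR range below are placeholder assumptions, not values taken from this module):

resource "google_compute_network" "databricks_vpc" {
  project                 = "my-network-project"   # placeholder
  name                    = "my-vpc"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "databricks_subnet" {
  project                  = "my-network-project"  # placeholder
  name                     = "my-subnet"
  region                   = "us-central1"         # placeholder region
  network                  = google_compute_network.databricks_vpc.id
  ip_cidr_range            = "10.0.0.0/21"         # placeholder range
  private_ip_google_access = true
}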

Quick Start

1. Navigate to Example Directory

cd example/

2. Configure Variables

# Copy the example file
cp terraform.tfvars.example terraform.tfvars

# Edit with your values
vim terraform.tfvars

Critical values to configure:

3. Phase 1: Provisioning

terraform init
terraform plan -var="phase=PROVISIONING"
terraform apply -var="phase=PROVISIONING"

Outputs: Note the workspace_gsa_email value; it is needed for the manual step below.

Manual Step: Add the workspace GSA (from outputs) to your operator group in Google Workspace or GCP IAM.
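This step is manual because the GSA e-mail only exists after Phase 1 completes. If your operator group is a Cloud Identity group and you prefer to codify the membership afterwards, one possible sketch with the google provider (the group resource name and GSA e-mail are placeholders, not outputs of this module):

resource "google_cloud_identity_group_membership" "workspace_gsa" {
  group = "groups/0123456789abcde"  # placeholder: operator group resource name
  preferred_member_key {
    id = "db-workspace-sa@my-project.iam.gserviceaccount.com"  # placeholder: workspace_gsa_email output
  }
  roles {
    name = "MEMBER"
  }
}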

4. Phase 2: Running

# Wait for workspace to reach RUNNING status (check Databricks console)
terraform plan -var="phase=RUNNING"
terraform apply -var="phase=RUNNING"

This will create all workspace resources: pools, policies, Unity Catalog, SQL warehouses, and permissions.

What Gets Created

Phase 1: PROVISIONING

Phase 2: RUNNING

2-Phase Deployment Explained

Why 2 Phases?

Databricks workspace creation is asynchronous. The workspace must be fully initialized before resources can be provisioned. Attempting to create resources while the workspace is still initializing causes errors:

Error: cannot create resources: workspace not in RUNNING state

The Solution

Phase 1: PROVISIONING - Creates the network configuration and an empty workspace shell

Phase 2: RUNNING - Attaches the network, waits for the workspace to reach RUNNING status, then provisions all workspace resources

Status Polling

The module includes a workspace status polling mechanism, implemented in workspace.tf, that blocks Phase 2 resource creation until the workspace reports RUNNING.
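For illustration only, a polling step of this kind can be sketched as a local-exec loop against the account-level workspaces API (this is not necessarily how workspace.tf implements it; var.databricks_account_id and local.workspace_id are placeholder references, and a valid account API token is assumed in DATABRICKS_TOKEN):

resource "null_resource" "wait_for_workspace_running" {
  provisioner "local-exec" {
    command = <<-EOT
      until curl -sf -H "Authorization: Bearer $DATABRICKS_TOKEN" \
        "https://accounts.gcp.databricks.com/api/2.0/accounts/${var.databricks_account_id}/workspaces/${local.workspace_id}" \
        | grep -q '"workspace_status": *"RUNNING"'; do
        echo "Waiting for workspace to reach RUNNING ..."
        sleep 30
      done
    EOT
  }
}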

Control Variable

The phase variable controls deployment:

phase = "PROVISIONING"  # Phase 1: Workspace creation
phase = "RUNNING"       # Phase 2: Resource provisioning

Internally, this sets the provision_workspace_resources flag: false for PROVISIONING, true for RUNNING.

All workspace resources use this pattern:

for_each = var.provision_workspace_resources ? local.resources_map : {}
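A minimal sketch of how the example wrapper can derive that flag from the phase variable (the actual logic lives in example/locals.tf and may differ; the local and module names here are placeholders):

locals {
  provision_workspace_resources = var.phase == "RUNNING"
}

module "databricks_workspace" {
  source                        = "../module"
  provision_workspace_resources = local.provision_workspace_resources
  # ... remaining inputs ...
}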

Directory Structure

lpw/
├── README.md                          # This file
├── module/                            # Reusable Terraform module
│   ├── README.md                      # Module documentation
│   ├── versions.tf                    # Provider version requirements
│   ├── providers.tf                   # Provider configurations
│   ├── variables.tf                   # Input variables
│   ├── locals.tf                      # Local values and transformations
│   ├── data.tf                        # Data source lookups
│   ├── outputs.tf                     # Module outputs
│   ├── workspace.tf                   # Workspace creation + status polling
│   ├── catalog*.tf                    # Unity Catalog resources (4 files)
│   ├── gcs.tf                         # GCS bucket creation
│   ├── storage_permission.tf          # GCS IAM bindings
│   ├── workspace_computes.tf          # Instance pools + cluster policies
│   ├── workspace_*_permissions.tf     # Permission grants (3 files)
│   ├── sql_warehouse.tf               # SQL warehouse creation
│   ├── sql_permissions.tf             # SQL warehouse permissions
│   ├── foriegn_catalog_connections.tf # BigQuery connections
│   └── random.tf                      # Random string generation
└── example/                           # Full deployment example with 2-phase logic
    ├── README.md                      # Deployment guide
    ├── main.tf                        # Module invocation
    ├── variables.tf                   # Variable definitions
    ├── locals.tf                      # Phase configuration logic
    ├── outputs.tf                     # Output forwarding
    └── terraform.tfvars.example       # Configuration template

Why Two Directories?

Configuration Examples

Minimal Configuration

# Focus on required fields only
workspace_name = "my-databricks-workspace"
metastore_id   = "your-metastore-uuid"

# Network
network_project_id = "my-network-project"
vpc_id             = "my-vpc"
subnet_id          = "my-subnet"

# One catalog
unity_catalog_config = "[{\"name\": \"main\", \"external_bucket\": \"my-catalog-bucket\", \"shared\": \"false\"}]"
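If hand-escaping the JSON string becomes error-prone, the same value can be produced with Terraform's jsonencode() function (a convenience sketch; the module still receives the JSON-encoded string it expects):

unity_catalog_config = jsonencode([
  {
    name            = "main"
    external_bucket = "my-catalog-bucket"
    shared          = "false"
  }
])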

Production Configuration

# Complete setup with multiple catalogs, SQL warehouses, and granular permissions
unity_catalog_config = <<-EOT
[
  {"name": "prod_data", "external_bucket": "prod-data-bucket", "shared": "false"},
  {"name": "dev_data", "external_bucket": "dev-data-bucket", "shared": "false"}
]
EOT

sqlwarehouse_cluster_config = <<-EOT
[
  {"name": "small-sql", "config": {"type": "small", "max_instance": 2, "serverless": "true"}, "permission": [...]},
  {"name": "large-sql", "config": {"type": "large", "max_instance": 4, "serverless": "true"}, "permission": [...]}
]
EOT

# Multiple groups with different roles
permissions_group_role_user = "data-analysts,data-engineers,data-scientists,admins"

Troubleshooting

Common Issues

1. Workspace Stuck in PROVISIONING

Symptoms: Phase 2 hangs during workspace status polling
Solution: Confirm the workspace GSA from Phase 1 was added to the operator group, and check the workspace status in the Databricks account console before re-running Phase 2.

2. Permission Denied Errors

Symptoms: Error: insufficient permissions to create resource
Solution: Verify the deployment service account has the permissions listed in the Prerequisites section and that you are authenticated as the expected identity.

3. Network Attachment Fails

Symptoms: Error: cannot attach network to workspace
Solution: Confirm the VPC, subnet, and network project IDs in terraform.tfvars match existing resources and that Phase 1 completed successfully.

4. Storage Credential Creation Fails

Symptoms: Error creating storage credential
Solution: Confirm the metastore has been assigned to the workspace (this happens earlier in Phase 2), then re-run terraform apply -var="phase=RUNNING".

Advanced Topics

Custom Cluster Policies

The module creates policies for each compute size (Small, Medium, Large):

Unity Catalog Permissions Model

Permissions are granted at multiple levels:

External Project Support

To create GCS buckets in a different project:

external_project  = true
bucket_project_id = "my-data-project"

Billing Tags

Optional billing/tracking codes for cost allocation:

costcenter  = "CC12345"   # Cost center
apmid       = "APM000000" # Application Portfolio Management ID
ssp         = "SSP000000" # Service & Support Plan ID
trproductid = "0000"      # Product tracking ID

Security Considerations

  1. Sensitive Variables: Account IDs, endpoint IDs, and metastore IDs are marked sensitive = true
  2. Authentication: Use Application Default Credentials or service account impersonation (avoid key files)
  3. Network Isolation: Deploy with BYOVPC and PSC for maximum security
  4. Encryption: Enable CMEK for data at rest (configured at workspace level)
  5. Access Control: Grant minimum necessary permissions to groups
  6. Audit: All resources are tagged with owner, team, and environment

Migration and Updates

Updating Compute Policies

Cluster policies use JSON. To update:

  1. Modify policy in Databricks UI
  2. Export policy JSON
  3. Update cluster_policy_permissions in terraform.tfvars
  4. Run terraform apply -var="phase=RUNNING"

Adding New Catalogs

# Update unity_catalog_config with new catalog
unity_catalog_config = <<-EOT
[
  {"name": "existing_catalog", ...},
  {"name": "new_catalog", "external_bucket": "new-bucket", "shared": "false"}
]
EOT

# Update permissions
unity_catalog_permissions = <<-EOT
[
  {"name": "existing_catalog", ...},
  {"name": "new_catalog", "permission": [...]}
]
EOT

Workspace Scaling

To add more compute capacity:

  1. Increase instance pool max_capacity in locals.tf
  2. Add larger compute types: compute_types = "Small,Medium,Large,XLarge"
  3. Apply changes: terraform apply -var="phase=RUNNING"

Support

For issues related to:

License

This module is provided as-is for use with Databricks on GCP deployments.

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request with detailed description

Note: This module creates billable resources in both GCP (compute, storage, networking) and Databricks (DBUs). Review pricing before deployment.