⚠️ IMPORTANT - SPECIAL CONFIGURATION NOTICE
Least Privilege Workspaces (LPW) is a specialized Databricks deployment configuration with enhanced security controls. This configuration:
- Requires explicit allowlisting by Databricks support team
- Is NOT available to all accounts - requires Databricks approval and coordination
Before using this module:
- Contact your Databricks Account Team or Databricks Support
- Request approval for Least Privilege Workspace deployment
- Review the VPC-SC policies in /templates/vpcsc-policy/least-privilege-workspaces/
- Coordinate with Databricks for your specific security requirements
For standard workspace deployments, use one of the other configurations in /templates/terraform-scripts/ (byovpc-ws, byovpc-psc-ws, byovpc-cmek-ws, byovpc-psc-cmek-ws, or end2end).

This module is intended for highly regulated environments (financial services, healthcare, government) where maximum security posture and explicit access allowlisting are mandatory requirements.
A production-ready Terraform module for deploying Databricks workspaces on Google Cloud Platform (GCP) with complete Unity Catalog governance, compute policies, and SQL warehouses.
This module provides a comprehensive solution for creating enterprise-grade Databricks workspaces on GCP. The two-phase deployment flow is shown below:
```mermaid
graph TB
    subgraph "Phase 1: PROVISIONING"
        A[terraform apply -var=phase=PROVISIONING] --> B[Create Network Config]
        B --> C[Create Workspace Shell]
        C --> D[Databricks Creates Workspace GSA]
        D --> E[Manual: Add GSA to Operator Group]
    end

    subgraph "Phase 2: RUNNING"
        E --> F[terraform apply -var=phase=RUNNING]
        F --> G[Attach Network to Workspace]
        G --> H[Poll Workspace Status]
        H --> I{Status = RUNNING?}
        I -->|No| H
        I -->|Yes| J[Assign Metastore]
        J --> K[Create Storage Credentials]
        K --> L[Create GCS Buckets + IAM]
        L --> M[Create External Locations]
        M --> N[Create Catalogs]
        N --> O[Create Instance Pools]
        O --> P[Create Cluster Policies]
        P --> Q[Create SQL Warehouses]
        Q --> R[Grant Permissions]
    end
```
⚠️ IMPORTANT: This module does NOT create the foundational infrastructure. The following resources must exist BEFORE deploying:
- Databricks account, accessible via the account console (https://accounts.gcp.databricks.com)
- Databricks groups: databricks-admins, databricks-writers, databricks-readers
- Subnet sized at least /26 (64 IPs) - required; /24 (256 IPs) recommended for production workloads
- IAM roles for the deploying identity:
  - roles/iam.serviceAccountUser - to use service accounts
  - roles/compute.networkAdmin - to configure networking
  - roles/storage.admin - to create GCS buckets
  - roles/resourcemanager.projectIamAdmin - to grant IAM permissions

Before starting deployment, gather the required configuration values, including your deployment region (e.g., us-east4).

This module creates the following resources (it does NOT create the prerequisites above):
Phase 1: network configuration and the workspace shell
Phase 2: metastore assignment, storage credentials, GCS buckets + IAM, external locations, catalogs, instance pools, cluster policies, SQL warehouses, and permission grants
The following are NOT created by this module and must exist beforehand:
- ❌ VPC network
- ❌ Subnets
- ❌ Unity Catalog metastore
- ❌ Databricks account
- ❌ Databricks groups
- ❌ GCP projects
- ❌ Regional Databricks endpoint infrastructure
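For orientation, here is a minimal sketch of the prerequisite network, created outside this module. All names, the project ID, region, and CIDR range are placeholder assumptions, not values the module requires:

```hcl
# Hypothetical illustration of the prerequisite VPC and subnet that must
# already exist before this module is deployed.
resource "google_compute_network" "databricks_vpc" {
  project                 = "my-network-project"
  name                    = "my-vpc"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "databricks_subnet" {
  project                  = "my-network-project"
  name                     = "my-subnet"
  region                   = "us-east4"
  network                  = google_compute_network.databricks_vpc.id
  ip_cidr_range            = "10.0.0.0/24" # /24 recommended for production
  private_ip_google_access = true
}
```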
```bash
cd example/

# Copy the example file
cp terraform.tfvars.example terraform.tfvars

# Edit with your values
vim terraform.tfvars
```
Critical values to configure:
- databricks_account_id - your Databricks account UUID
- databricks_google_service_account - service account email

```bash
terraform init
terraform plan -var="phase=PROVISIONING"
terraform apply -var="phase=PROVISIONING"
```
Outputs: note the workspace_gsa_email value from the Terraform outputs.
Manual Step: Add the workspace GSA (from outputs) to your operator group in Google Workspace or GCP IAM.
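If your operator group is managed in Cloud Identity, this manual step can also be expressed in Terraform. A minimal sketch, where the group resource name and member email are placeholders you must replace with your own values:

```hcl
# Hypothetical: add the workspace GSA to the operator group via Cloud Identity.
resource "google_cloud_identity_group_membership" "workspace_gsa" {
  group = "groups/01234567890abcdef" # your operator group resource name

  preferred_member_key {
    id = "db-1234567890@prod-gcp-us-east4.iam.gserviceaccount.com" # workspace GSA from outputs
  }

  roles {
    name = "MEMBER"
  }
}
```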
```bash
# Wait for workspace to reach RUNNING status (check Databricks console)
terraform plan -var="phase=RUNNING"
terraform apply -var="phase=RUNNING"
```
This will create all workspace resources: pools, policies, Unity Catalog, SQL warehouses, and permissions.
The workspace GSA has the form db-<workspace-id>@prod-gcp-<region>.iam.gserviceaccount.com.

Databricks workspace creation is asynchronous: the workspace must be fully initialized before resources can be provisioned. Attempting to create resources while the workspace is still initializing causes errors such as:

```
Error: cannot create resources: workspace not in RUNNING state
```
Phase 1: PROVISIONING - creates an empty workspace shell; Databricks then creates the workspace GSA (db-<workspace-id>@prod-gcp-<region>.iam.gserviceaccount.com)

Phase 2: RUNNING - turns the workspace into a functional, running state; waits for the workspace status to reach RUNNING and, once RUNNING, safely provisions all resources:
The module includes a workspace status polling mechanism; a rough sketch of the idea follows.
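As an illustration of the concept (not the module's actual code), polling can be expressed with a provisioner that blocks until the workspace reports the expected status. Here wait_for_workspace.sh is a hypothetical helper script, and the databricks_mws_workspaces resource name is an assumption:

```hcl
# Hypothetical sketch: block apply until the workspace reaches RUNNING.
# "wait_for_workspace.sh" is a placeholder script, not shipped with the module.
resource "terraform_data" "workspace_ready" {
  input = databricks_mws_workspaces.this.workspace_id

  provisioner "local-exec" {
    command = "./wait_for_workspace.sh ${databricks_mws_workspaces.this.workspace_id}"
  }
}
```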
The phase variable controls deployment:

```hcl
phase = "PROVISIONING" # Phase 1: Workspace creation
phase = "RUNNING"      # Phase 2: Resource provisioning
```
Internally, this sets:
- expected_workspace_status: the status to wait for
- provision_workspace_resources: boolean flag for resource creation

All workspace resources use this pattern (see the fuller sketch below):

```hcl
for_each = var.provision_workspace_resources ? local.resources_map : {}
```
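A fuller sketch of the gating pattern, with an illustrative catalog map; the resource and local names are assumptions, not the module's exact code:

```hcl
# Illustrative: resources are created only once the phase-2 flag is set.
locals {
  catalogs = tomap({
    prod_data = { bucket = "prod-data-bucket" }
    dev_data  = { bucket = "dev-data-bucket" }
  })
}

resource "databricks_catalog" "this" {
  for_each     = var.provision_workspace_resources ? local.catalogs : {}
  name         = each.key
  storage_root = "gs://${each.value.bucket}"
}
```

During Phase 1 the flag is false, so for_each resolves to an empty map and Terraform plans zero workspace resources; during Phase 2 it resolves to the full map.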
```
lpw/
├── README.md                          # This file
├── module/                            # Reusable Terraform module
│   ├── README.md                      # Module documentation
│   ├── versions.tf                    # Provider version requirements
│   ├── providers.tf                   # Provider configurations
│   ├── variables.tf                   # Input variables
│   ├── locals.tf                      # Local values and transformations
│   ├── data.tf                        # Data source lookups
│   ├── outputs.tf                     # Module outputs
│   ├── workspace.tf                   # Workspace creation + status polling
│   ├── catalog*.tf                    # Unity Catalog resources (4 files)
│   ├── gcs.tf                         # GCS bucket creation
│   ├── storage_permission.tf          # GCS IAM bindings
│   ├── workspace_computes.tf          # Instance pools + cluster policies
│   ├── workspace_*_permissions.tf     # Permission grants (3 files)
│   ├── sql_warehouse.tf               # SQL warehouse creation
│   ├── sql_permissions.tf             # SQL warehouse permissions
│   ├── foriegn_catalog_connections.tf # BigQuery connections
│   └── random.tf                      # Random string generation
└── example/                           # Full deployment example with 2-phase logic
    ├── README.md                      # Deployment guide
    ├── main.tf                        # Module invocation
    ├── variables.tf                   # Variable definitions
    ├── locals.tf                      # Phase configuration logic
    ├── outputs.tf                     # Output forwarding
    └── terraform.tfvars.example       # Configuration template
```
Why Two Directories?
- module/ - the reusable Terraform module with all resource definitions
- example/ - shows how to use the module correctly with the 2-phase deployment logic

```hcl
# Focus on required fields only
workspace_name = "my-databricks-workspace"
metastore_id   = "your-metastore-uuid"

# Network
network_project_id = "my-network-project"
vpc_id             = "my-vpc"
subnet_id          = "my-subnet"

# One catalog
unity_catalog_config = "[{\"name\": \"main\", \"external_bucket\": \"my-catalog-bucket\", \"shared\": \"false\"}]"
```
```hcl
# Complete setup with multiple catalogs, SQL warehouses, and granular permissions
unity_catalog_config = <<-EOT
  [
    {"name": "prod_data", "external_bucket": "prod-data-bucket", "shared": "false"},
    {"name": "dev_data", "external_bucket": "dev-data-bucket", "shared": "false"}
  ]
EOT

sqlwarehouse_cluster_config = <<-EOT
  [
    {"name": "small-sql", "config": {"type": "small", "max_instance": 2, "serverless": "true"}, "permission": [...]},
    {"name": "large-sql", "config": {"type": "large", "max_instance": 4, "serverless": "true"}, "permission": [...]}
  ]
EOT
```
```hcl
# Multiple groups with different roles
permissions_group_role_user = "data-analysts,data-engineers,data-scientists,admins"
```
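Inside the module, the comma-separated string is presumably split into individual groups before grants are applied. An illustrative sketch using the Databricks provider; the catalog name, privilege, and local name are placeholders:

```hcl
# Illustrative: one USE_CATALOG grant per configured group.
locals {
  user_groups = toset(split(",", var.permissions_group_role_user))
}

resource "databricks_grants" "catalog_usage" {
  catalog = "prod_data" # placeholder catalog name

  dynamic "grant" {
    for_each = local.user_groups
    content {
      principal  = grant.value
      privileges = ["USE_CATALOG"]
    }
  }
}
```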
Symptoms: Phase 2 hangs during workspace status polling.
Solution:
Symptoms: Error: insufficient permissions to create resource
Solution:
Symptoms: Error: cannot attach network to workspace
Solution:
Symptoms: Error creating storage credential
Solution:
The module creates policies for each compute size (Small, Medium, Large):
Permissions are granted at multiple levels:

- data_editor, data_reader, data_writer
- writer, reader
- writer, reader

To create GCS buckets in a different project:

```hcl
external_project  = true
bucket_project_id = "my-data-project"
```
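A sketch of how this switch presumably plays out on the bucket resource; the var.google_project_id fallback variable and bucket name are assumptions for illustration:

```hcl
# Illustrative: place catalog buckets in the external data project when set.
resource "google_storage_bucket" "catalog" {
  name     = "my-catalog-bucket"
  project  = var.external_project ? var.bucket_project_id : var.google_project_id
  location = "US-EAST4"

  uniform_bucket_level_access = true
}
```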
Optional billing/tracking codes for cost allocation:
costcenter = "CC12345" # Cost center
apmid = "APM000000" # Application Portfolio Management ID
ssp = "SSP000000" # Service & Support Plan ID
trproductid = "0000" # Product tracking ID
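These codes presumably end up as resource labels for cost allocation; a minimal sketch of collecting them into one label map (the local name and lowercase normalization are assumptions; GCP label values must be lowercase):

```hcl
# Illustrative: tracking codes gathered into a reusable label map.
locals {
  billing_labels = {
    costcenter  = lower(var.costcenter)
    apmid       = lower(var.apmid)
    ssp         = lower(var.ssp)
    trproductid = var.trproductid
  }
}
```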
Sensitive values should be marked sensitive = true.

Cluster policies use JSON. To update them:

1. Edit cluster_policy_permissions in terraform.tfvars
2. Run terraform apply -var="phase=RUNNING"

To add a new catalog, extend the configuration:
unity_catalog_config = "[
{\"name\": \"existing_catalog\", ...},
{\"name\": \"new_catalog\", \"external_bucket\": \"new-bucket\", \"shared\": \"false\"}
]"
# Update permissions
unity_catalog_permissions = "[
{\"name\": \"existing_catalog\", ...},
{\"name\": \"new_catalog\", \"permission\": [...]}
]"
To add more compute capacity:

1. Increase max_capacity in locals.tf
2. Extend compute_types, e.g. compute_types = "Small,Medium,Large,XLarge"
3. Run terraform apply -var="phase=RUNNING"
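A sketch of how compute_types and a capacity map might drive pool creation; the locals, node type, and capacity values are illustrative, not the module's exact code:

```hcl
# Illustrative: one instance pool per configured compute size.
locals {
  compute_sizes = toset(split(",", var.compute_types))

  pool_max_capacity = {
    Small  = 2
    Medium = 4
    Large  = 8
    XLarge = 16
  }
}

resource "databricks_instance_pool" "this" {
  for_each                              = local.compute_sizes
  instance_pool_name                    = "pool-${lower(each.key)}"
  max_capacity                          = local.pool_max_capacity[each.key]
  node_type_id                          = "n2-standard-4" # placeholder node type
  idle_instance_autotermination_minutes = 15
}
```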
For issues related to this module, note that it is provided as-is for use with Databricks on GCP deployments.
Contributions are welcome!
Note: This module creates billable resources in both GCP (compute, storage, networking) and Databricks (DBUs). Review pricing before deployment.