This Terraform configuration sets up Unity Catalog on an existing Databricks workspace on Google Cloud Platform (GCP). It adds Unity Catalog with data governance, external storage, and user/group management to workspaces that were created without Unity Catalog.
This is a standalone Unity Catalog configuration that can be applied to existing Databricks workspaces. Unlike the end-to-end configuration (../end2end/), this does not create a workspace—it only sets up Unity Catalog and related data governance resources.
| Aspect | Standalone UC (uc/) | End-to-End (end2end/) |
|---|---|---|
| Creates Workspace | ❌ No | ✅ Yes |
| Creates Unity Catalog | ✅ Yes | ✅ Yes |
| Workspace Required | ✅ Must exist | ❌ Creates new |
| Use Case | Add UC to existing workspace | New workspace with UC |
| Deployment | On top of existing | Complete from scratch |
```mermaid
graph TB
    subgraph "Existing Infrastructure"
        WS[Existing Databricks Workspace<br/>Already Deployed]
    end

    subgraph "Unity Catalog - Added by This Config"
        subgraph "Metastore"
            META[Unity Catalog Metastore<br/>Central Metadata Repository]
            META_BUCKET[GCS Bucket<br/>Metastore Storage]
        end

        subgraph "Groups"
            UC_ADMIN[UC Admins Group]
            GROUP1[Data Engineering Group]
            GROUP2[Data Science Group]
        end

        subgraph "Users"
            USER1[Admin User 1<br/>Auto-generated]
            USER2[Admin User 2<br/>From variable]
            USER3[Service Account<br/>From variable]
        end

        subgraph "Permissions"
            WS_ASSIGN1[Data Science → ADMIN]
            WS_ASSIGN2[Data Eng → USER]
        end
    end

    META --> META_BUCKET
    UC_ADMIN --> USER1
    UC_ADMIN --> USER2
    UC_ADMIN --> USER3
    META --> WS
    WS_ASSIGN1 --> WS
    WS_ASSIGN2 --> WS

    style WS fill:#4285F4
    style META fill:#FF3621
    style UC_ADMIN fill:#FBBC04
    style META_BUCKET fill:#34A853
```
✅ Perfect for:
- Adding Unity Catalog to an existing workspace

❌ Not suitable for:
- Creating new workspaces (use ../end2end/ or workspace-specific configs)

Scenario 1: Legacy Workspace Migration
Problem: Workspace created before Unity Catalog
Solution: Apply this config to add UC retroactively
Scenario 2: Phased Deployment
Phase 1: Deploy basic workspace (../byovpc-ws/)
Phase 2: Add Unity Catalog (this config)
Phase 3: Add security features (PSC/CMEK)
Scenario 3: Multiple Workspaces, Single Metastore
Workspace 1: Create with UC (../end2end/)
Workspace 2: Create basic (../byovpc-ws/)
Workspace 3: Create basic (../byovpc-ws/)
Then: Use this config to assign Workspace 2 & 3 to same metastore
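Scenario 3 works because a metastore is regional and can be assigned to many workspaces. A minimal sketch of the extra assignments, assuming hypothetical workspace IDs and the metastore resource defined in this config:

```hcl
# Hypothetical IDs of additional workspaces in the same region as the metastore.
locals {
  extra_workspace_ids = ["2222222222222222", "3333333333333333"]
}

# One assignment per workspace; all point at the same metastore.
resource "databricks_metastore_assignment" "additional" {
  for_each     = toset(local.extra_workspace_ids)
  provider     = databricks.accounts
  workspace_id = each.value
  metastore_id = databricks_metastore.this.id
}
```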
⚠️ Critical: You must have an existing, running Databricks workspace.
Required Information:
- Workspace ID
- Workspace URL (e.g., https://1234567890123456.1.gcp.databricks.com)

How to Find Workspace ID:
Option A: From URL

```
URL: https://1234567890123456.1.gcp.databricks.com
Workspace ID: 1234567890123456
```
Option B: Via Terraform

```bash
# If workspace was created with Terraform
terraform output workspace_id

# Or from state
terraform state show databricks_mws_workspaces.databricks_workspace
```
Option C: From Account Console
- Go to https://accounts.gcp.databricks.com → Workspaces

You also need Databricks account admin access (https://accounts.gcp.databricks.com) and a GCP automation service account (e.g., automation-sa@project.iam.gserviceaccount.com) with:

On Service/Consumer Project:
- roles/storage.admin (for GCS bucket creation)
- roles/iam.serviceAccountUser

On Databricks Account:
- The admin user set via the databricks_admin_user variable
- The service account set via the google_service_account_email variable

❌ Does not create:
For these features, see:
- ../end2end/
- unity-objects-management.tf in ../end2end/

```hcl
provider "google" {
  project = var.google_project_name
  region  = var.google_region
}
```
Used for: creating the metastore GCS bucket and granting bucket IAM permissions.
provider "databricks" {
alias = "accounts"
host = "https://accounts.gcp.databricks.com"
google_service_account = var.google_service_account_email
}
Used for: account-level operations (creating groups, users, the metastore, and workspace assignments).
Important: All Unity Catalog operations at account level must use this provider.
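For example, every account-level resource below selects the alias explicitly. A minimal sketch; omitting the provider argument would target the default workspace-level provider instead:

```hcl
# Account-level resources must reference the aliased provider.
resource "databricks_group" "example" {
  provider     = databricks.accounts
  display_name = "example-group"
}
```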
resource "databricks_group" "uc_admins"
Purpose: administrative group for managing Unity Catalog.

Members:
- Admin user 1 (admin_member0)
- Admin user 2 (admin_member1)
- Service account (admin_member2)

resource "databricks_group" "data_eng"
resource "databricks_group" "data_science"
Purpose: workspace access groups for the data engineering and data science teams.
Created at: Account level (can be used across workspaces)
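A hedged sketch of how the groups, users, and memberships might fit together; the resource names follow the member names listed above, but the exact wiring is an assumption:

```hcl
# Account-level group, reusable across all workspaces in the account.
resource "databricks_group" "data_science" {
  provider     = databricks.accounts
  display_name = var.group_name2
}

# Admin user referenced by the databricks_admin_user variable.
resource "databricks_user" "admin" {
  provider  = databricks.accounts
  user_name = var.databricks_admin_user
}

# Membership linking the admin user into the UC Admins group.
resource "databricks_group_member" "admin_member1" {
  provider  = databricks.accounts
  group_id  = databricks_group.uc_admins.id
  member_id = databricks_user.admin.id
}
```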
resource "databricks_metastore" "this"
Configuration:
- Name: primary-metastore-<region>-<random-suffix>
- force_destroy = true (for testing environments)

Purpose: central metadata repository for Unity Catalog objects.
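A minimal sketch of the metastore resource following the naming scheme above; the random_string suffix and the bucket resource name are assumptions:

```hcl
# Random suffix to keep the metastore name unique per deployment.
resource "random_string" "suffix" {
  length  = 6
  upper   = false
  special = false
}

resource "databricks_metastore" "this" {
  provider      = databricks.accounts
  name          = "primary-metastore-${var.google_region}-${random_string.suffix.result}"
  storage_root  = "gs://${google_storage_bucket.metastore.name}" # assumed bucket resource
  region        = var.google_region
  force_destroy = true # convenient for testing; reconsider for production
}
```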
resource "databricks_metastore_data_access" "first"
Creates: a default storage credential backed by a Databricks-managed GCP service account.
IAM Grants:
- roles/storage.objectAdmin (read/write to metastore bucket)
- roles/storage.legacyBucketReader (list bucket contents)

Note: Destroying this resource is not supported by Terraform. Use `terraform state rm` before `terraform destroy`.
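A sketch of the storage credential plus one of the IAM grants it needs; the bucket resource name is an assumption:

```hcl
# Default storage credential: Databricks provisions a GCP service account.
resource "databricks_metastore_data_access" "first" {
  provider     = databricks.accounts
  metastore_id = databricks_metastore.this.id
  name         = "default-storage-credential"
  is_default   = true

  databricks_gcp_service_account {}
}

# Grant the Databricks-managed service account read/write on the bucket.
# A second google_storage_bucket_iam_member with roles/storage.legacyBucketReader
# follows the same pattern.
resource "google_storage_bucket_iam_member" "object_admin" {
  bucket = google_storage_bucket.metastore.name # assumed bucket resource
  role   = "roles/storage.objectAdmin"
  member = "serviceAccount:${databricks_metastore_data_access.first.databricks_gcp_service_account[0].email}"
}
```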
resource "databricks_metastore_assignment" "this"
Links: the Unity Catalog metastore to the existing workspace.
Critical Configuration:
```hcl
locals {
  workspace_id = "<workspace-id>" # Must be hardcoded
}
```
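The assignment itself is then a single resource tying the two IDs together (a sketch, using the metastore resource from this config):

```hcl
resource "databricks_metastore_assignment" "this" {
  provider     = databricks.accounts
  workspace_id = local.workspace_id # the hardcoded ID from the locals block
  metastore_id = databricks_metastore.this.id
}
```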
resource "databricks_mws_permission_assignment"
Grants:
["ADMIN"] role["USER"] rolePurpose:
```mermaid
sequenceDiagram
    participant TF as Terraform
    participant GCP as Google Cloud
    participant DB_ACC as Databricks Account
    participant WS as Existing Workspace
    participant UC as Unity Catalog

    Note over WS: Workspace Already Exists

    Note over TF,DB_ACC: Phase 1: Groups and Users
    TF->>DB_ACC: Create UC Admins Group
    TF->>DB_ACC: Create Data Engineering Group
    TF->>DB_ACC: Create Data Science Group
    TF->>DB_ACC: Create/Retrieve Users
    TF->>DB_ACC: Add Users to Groups

    Note over TF,GCP: Phase 2: Storage
    TF->>GCP: Create Metastore GCS Bucket
    GCP-->>TF: Bucket Created

    Note over TF,UC: Phase 3: Metastore
    TF->>UC: Create Unity Catalog Metastore
    UC-->>TF: Metastore ID

    Note over TF,UC: Phase 4: Storage Credentials
    TF->>UC: Create Default Storage Credential
    UC-->>TF: Databricks Service Account
    TF->>GCP: Grant Bucket Permissions to SA
    GCP-->>TF: Permissions Granted

    Note over TF,WS: Phase 5: Metastore Assignment
    TF->>DB_ACC: Assign Metastore to Workspace
    DB_ACC->>WS: Enable Unity Catalog
    WS-->>TF: UC Enabled

    Note over TF,WS: Phase 6: Workspace Assignments
    TF->>DB_ACC: Assign Data Science Group (ADMIN)
    TF->>DB_ACC: Assign Data Engineering Group (USER)
    DB_ACC-->>TF: Groups Assigned

    Note over WS: Workspace Now Has Unity Catalog
```
Edit providers.auto.tfvars:
```hcl
# Service Account
google_service_account_email = "automation-sa@my-service-project.iam.gserviceaccount.com"

# Service/Consumer Project
google_project_name = "my-service-project"

# Region (must match workspace region)
google_region = "us-central1"
```
Edit unity-setup.auto.tfvars:
```hcl
# Databricks Account ID
databricks_account_id = "12345678-1234-1234-1234-123456789abc"

# UC Admin Group
uc_admin_group_name = "unity-catalog-admins"

# Workspace Groups
group_name1 = "data-engineering"
group_name2 = "data-science"

# Admin User (existing user in your organization)
databricks_admin_user = "admin@mycompany.com"
```
Edit unity-setup.tf (line 51-54):
```hcl
# CRITICAL: Update this with your existing workspace ID
locals {
  workspace_id = "1234567890123456" # Replace with actual workspace ID
}
```
How to find workspace ID: See Prerequisites
Before deployment:
- Workspace ID updated in locals
- Workspace region matches the google_region variable

```bash
# Option 1: Service Account Impersonation
gcloud config set auth/impersonate_service_account automation-sa@project.iam.gserviceaccount.com
export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)

# Option 2: Service Account Key
export GOOGLE_APPLICATION_CREDENTIALS=~/sa-key.json
```
```bash
cd gcp/gh-repo/gcp/terraform-scripts/uc
```
⚠️ CRITICAL STEP: Edit unity-setup.tf and update workspace_id in locals block.
```hcl
# Line ~51-54 in unity-setup.tf
locals {
  workspace_id = "YOUR-WORKSPACE-ID-HERE" # Update this!
}
```
```bash
terraform init
terraform plan
```
Expected Resources (~15-20 resources): groups and memberships, users, the metastore GCS bucket, the metastore, the default storage credential with IAM bindings, and the metastore and workspace permission assignments.
```bash
terraform apply
```
Deployment Time: ~5-10 minutes
Progress: groups and users first, then the metastore bucket, metastore, storage credential, metastore assignment, and finally workspace permissions.
```bash
terraform output
```

Expected outputs:

```
metastore_id = "uuid-of-metastore"
uc_admins_group_id = "group-id"
data_eng_group_id = "group-id"
data_science_group_id = "group-id"
metastore_bucket_name = "unity-metastore-us-central1-xx"
```
In Workspace UI: confirm Unity Catalog is enabled and the main catalog is visible in the catalog browser.
Open a notebook or SQL editor:
```sql
-- Show catalogs (should include 'main')
SHOW CATALOGS;

-- Show schemas in main catalog
SHOW SCHEMAS IN main;

-- Create test schema
CREATE SCHEMA main.test_schema;

-- Create test table
CREATE TABLE main.test_schema.test_table (
  id INT,
  name STRING,
  created_at TIMESTAMP
);

-- Insert test data
INSERT INTO main.test_schema.test_table
VALUES (1, 'test', current_timestamp());

-- Query test table
SELECT * FROM main.test_schema.test_table;

-- Verify table is managed by Unity Catalog
DESCRIBE EXTENDED main.test_schema.test_table;
```
Check Group Memberships:
- unity-catalog-admins
- data-engineering
- data-science

Test Group Permissions:
- As a member of the data-engineering group, test creating a schema in the main catalog
- Repeat as a member of the data-science group

```bash
# List metastore bucket contents
gsutil ls gs://unity-metastore-us-central1-xx/

# Verify bucket IAM policy
gcloud storage buckets get-iam-policy gs://unity-metastore-us-central1-xx
```
You should see the Databricks service account with the storage.objectAdmin and storage.legacyBucketReader roles.
| Output | Description |
|---|---|
| metastore_id | Unity Catalog metastore UUID |
| uc_admins_group_id | UC Admins group ID |
| data_eng_group_id | Data Engineering group ID |
| data_science_group_id | Data Science group ID |
| metastore_bucket_name | GCS bucket name for metastore storage |
| metastore_storage_credential_id | Default storage credential ID |
View outputs:
```bash
terraform output
terraform output -json | jq
terraform output metastore_id
```
Error:

```
Error: cannot assign metastore: workspace not found
```

Solution:

1. Verify the workspace exists: go to https://accounts.gcp.databricks.com → Workspaces.
2. Check the workspace ID format:

```hcl
# Correct
workspace_id = "1234567890123456"

# Incorrect
workspace_id = "https://1234567890123456.1.gcp.databricks.com"
workspace_id = "my-workspace"
```

3. The workspace region must match the google_region variable.
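To catch a malformed ID before any API call, a hedged sketch using a Terraform check block (requires Terraform 1.5+; this guard is an assumption, not part of the config):

```hcl
# Fails the plan early if workspace_id is a URL or name instead of the numeric ID.
check "workspace_id_is_numeric" {
  assert {
    condition     = can(regex("^[0-9]+$", local.workspace_id))
    error_message = "workspace_id must be the numeric workspace ID, not a URL or workspace name."
  }
}
```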
Error:

```
Error: workspace already has metastore assigned
```

Solution:

This workspace already has Unity Catalog. You have two options:

Option A: Use the existing metastore (skip this config).

Option B: Reassign to the new metastore (manual step required): unassign the current metastore in the Account Console, then run terraform apply.

Error:

```
Error: cannot create storage credential
```
Solution:

Verify the metastore and its assignment exist in state:

```bash
# Check if metastore resource exists
terraform state show databricks_metastore.this

# Check if metastore_assignment resource exists
terraform state show databricks_metastore_assignment.this
```
Error:

```
Error: cannot create mws permission assignment: Permission assignment APIs are not available
```

Solution:

This API requires Unity Catalog to be assigned to the workspace first.

1. Verify the metastore assignment exists:

```bash
terraform state show databricks_metastore_assignment.this
```

2. Ensure depends_on is set in the workspace assignment resources:

```hcl
resource "databricks_mws_permission_assignment" "add_admin_group" {
  depends_on = [databricks_metastore_assignment.this] # Required!
  ...
}
```

3. Re-run the apply:

```bash
terraform apply
```
Error:

```
Error: group with name already exists
```

Solution:

Groups were created previously. Options:

Option A: Import the existing group:

```bash
terraform import databricks_group.uc_admins \
  "<account-id>|<group-id>"
```

Option B: Use a different group name:

```hcl
# In unity-setup.auto.tfvars
uc_admin_group_name = "unity-catalog-admins-v2"
```

Option C: Retrieve the existing group:

```hcl
# Change from 'resource' to 'data'
data "databricks_group" "uc_admins" {
  provider     = databricks.accounts
  display_name = var.uc_admin_group_name
}
```
Error:

```
Error: destroying metastore data access is not supported
```

Solution:

This is a known Terraform limitation.

Correct cleanup procedure:

```bash
# Step 1: Remove from Terraform state
terraform state rm databricks_metastore_data_access.first

# Step 2: Destroy other resources
terraform destroy

# Step 3: Manually delete metastore (if needed)
# Go to Account Console → Data → Metastores → Delete
```
```bash
# Check workspace info
terraform output workspace_id

# Check metastore
terraform state show databricks_metastore.this

# Check metastore assignment
terraform state show databricks_metastore_assignment.this

# Check storage credential
terraform state show databricks_metastore_data_access.first

# Check groups
terraform state list | grep databricks_group

# Check workspace assignments
terraform state list | grep mws_permission_assignment

# View GCS bucket
gsutil ls gs://unity-metastore-*/

# Check bucket IAM
gcloud storage buckets get-iam-policy gs://unity-metastore-us-central1-xx

# View all outputs
terraform output -json | jq
```
⚠️ Important considerations: destroying removes Unity Catalog from the workspace and deletes the metastore bucket, including any managed data in it.
Step 1: Remove metastore data access from state:
```bash
# Required due to Terraform limitation
terraform state rm databricks_metastore_data_access.first
```
Step 2: Unassign the metastore (optional, for reuse):

If you want to keep the metastore but remove it from this workspace, remove the metastore resource from Terraform state (terraform state rm databricks_metastore.this) so the destroy does not delete it.
Step 3: Destroy resources:
```bash
terraform destroy
```
What gets destroyed:
- The metastore assignment and workspace permission assignments
- The groups and users created by this config
- The metastore GCS bucket, including its contents (force_destroy = true)

Step 4: Manual cleanup (if needed):
Delete metastore in Account Console:
- Go to https://accounts.gcp.databricks.com → Data → Metastores

After successfully adding Unity Catalog to your workspace:
- Create catalogs and schemas: see ../end2end/unity-objects-management.tf for examples (and the sketch below)
- Migrate existing tables: use DEEP CLONE for table migration
- Adopt the three-level namespace (catalog.schema.table)
- Configure cluster policies: see ../end2end/cluster_policies.tf for examples
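As a starting point for the first item, a hedged sketch in the spirit of ../end2end/unity-objects-management.tf; the names are assumptions, and these resources need a workspace-level databricks provider, which this standalone config does not define:

```hcl
# Hypothetical catalog and schema; requires a workspace-level databricks provider.
resource "databricks_catalog" "sandbox" {
  metastore_id = databricks_metastore.this.id
  name         = "sandbox"
  comment      = "Managed by Terraform"
}

resource "databricks_schema" "events" {
  catalog_name = databricks_catalog.sandbox.name
  name         = "events"
  comment      = "Example schema"
}
```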
This configuration is provided as a reference implementation for adding Unity Catalog to existing Databricks workspaces on GCP.