A Terraform configuration for deploying the most secure Databricks workspace on Google Cloud Platform (GCP) featuring customer-managed VPC, Private Service Connect for private connectivity, and Customer-Managed Encryption Keys for data encryption.
This deployment creates the most secure Databricks workspace configuration with:
- Customer-managed (BYO) VPC
- Frontend and backend Private Service Connect endpoints
- Customer-Managed Encryption Keys for storage and managed services
- A private DNS zone for workspace resolution
- IP access lists on the workspace
Important: This configuration requires both PSC and CMEK to be enabled for your Databricks account. Contact Databricks support to enable these features.
```mermaid
graph TB
subgraph "GCP Project - Host/Shared VPC"
subgraph "Customer VPC"
SUBNET[Node Subnet<br/>Databricks Clusters]
PSC_SUBNET[PSC Subnet<br/>Private Endpoints]
subgraph "Private Service Connect"
FE_EP[Frontend PSC Endpoint<br/>Workspace UI & REST API]
BE_EP[Backend PSC Endpoint<br/>Cluster Relay]
FE_IP[Frontend Private IP]
BE_IP[Backend Private IP]
end
subgraph "Cloud DNS"
DNS_ZONE[Private DNS Zone<br/>gcp.databricks.com]
A_REC[4 A Records<br/>Workspace URLs]
end
end
subgraph "Cloud KMS"
KEYRING[Key Ring<br/>Customer Key Ring]
KEY[Crypto Key<br/>CMEK for Databricks]
end
end
subgraph "GCP Project - Service/Consumer"
subgraph "Databricks Managed - Encrypted & Private"
GKE[GKE Cluster<br/>Encrypted with CMEK]
GCS[GCS Buckets<br/>Encrypted with CMEK]
DISK[Persistent Disks<br/>Encrypted with CMEK]
end
end
subgraph "Databricks Control Plane - Private"
FE_SA[Frontend Service Attachment]
BE_SA[Backend Service Attachment]
end
subgraph "Users"
USER[Users via VPN<br/>Private Access Only]
end
KEYRING --> KEY
KEY -.Encrypts.-> GCS
KEY -.Encrypts.-> DISK
KEY -.Encrypts.-> GKE
SUBNET --> FE_EP
SUBNET --> BE_EP
FE_EP --> FE_IP
BE_EP --> BE_IP
FE_IP -.PSC.-> FE_SA
BE_IP -.PSC.-> BE_SA
FE_SA --> GKE
BE_SA --> GKE
GKE --> SUBNET
SUBNET --> GCS
DNS_ZONE --> A_REC
A_REC --> FE_IP
A_REC --> BE_IP
USER -.DNS.-> DNS_ZONE
USER -.Private.-> FE_EP
style FE_SA fill:#FF3621
style BE_SA fill:#FF3621
style KEY fill:#FBBC04
style GCS fill:#4285F4
style GKE fill:#4285F4
style DNS_ZONE fill:#34A853
```
This configuration does NOT include:
- Unity Catalog setup (see ../byovpc-cmek-ws/ for reference)

For these features, see:
- ../byovpc-cmek-ws/
- ../end2end/
- ../infra4db/
- ../byovpc-psc-ws/

You will also need:
- Access to the Databricks account console (https://accounts.gcp.databricks.com)
- An automation service account (e.g. automation-sa@project.iam.gserviceaccount.com)

Critical: You must request both PSC and CMEK enablement for your account before proceeding.
This configuration requires a pre-existing KMS key. Create one first using ../byovpc-cmek-ws/ or manually:
```bash
# Create key ring
gcloud kms keyrings create databricks-keyring \
--location=us-central1 \
--project=my-project
# Create crypto key
gcloud kms keys create databricks-key \
--keyring=databricks-keyring \
--location=us-central1 \
--purpose=encryption \
--rotation-period=31536000s \
--project=my-project
```
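If you prefer Terraform over gcloud, a minimal sketch of the same key follows; the resource and key names mirror the commands above and are illustrative:

```hcl
# Sketch: Terraform equivalent of the gcloud commands above.
resource "google_kms_key_ring" "databricks_keyring" {
  name     = "databricks-keyring"
  location = "us-central1"
  project  = "my-project"
}

resource "google_kms_crypto_key" "databricks_key" {
  name            = "databricks-key"
  key_ring        = google_kms_key_ring.databricks_keyring.id
  purpose         = "ENCRYPT_DECRYPT"
  rotation_period = "31536000s" # rotate yearly
}
```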
Key Requirements:
- Purpose: ENCRYPT_DECRYPT
- The automation service account needs the cloudkms.cryptoKeyEncrypterDecrypter role

This configuration requires a pre-existing VPC with appropriate subnets:
Required Subnets:
- Node subnet: /24 CIDR (251 usable IPs)
- PSC subnet: /28 CIDR (11 usable IPs)

To create this infrastructure, use ../infra4db/ first.
The service account needs these IAM roles:
On Service/Consumer Project:
- roles/compute.networkAdmin
- roles/iam.serviceAccountAdmin
- roles/resourcemanager.projectIamAdmin
- roles/storage.admin
- roles/cloudkms.cryptoKeyEncrypterDecrypter

On Host/Shared VPC Project:
- roles/compute.networkUser
- roles/compute.securityAdmin
- roles/dns.admin
- roles/cloudkms.viewer

You need the Databricks PSC service attachment URIs for your region:
Format:

```
Frontend: projects/prod-gcp-<region>/regions/<region>/serviceAttachments/plproxy-psc-endpoint-all-ports
Backend:  projects/prod-gcp-<region>/regions/<region>/serviceAttachments/ngrok-psc-endpoint
```
Find Service Attachments: [Databricks Supported Regions - PSC](https://docs.gcp.databricks.com/resources/supported-regions.html#psc)
- Terraform and the Google Cloud SDK (gcloud CLI) configured

Since the workspace uses private connectivity, you must reach it from inside the network: plan for VPN or Interconnect access to the VPC before deploying.
This configuration provides defense-in-depth security:
Private Service Connect ensures all traffic remains private:
| Traffic Type | Path | Benefit |
|---|---|---|
| User → Workspace UI | User → VPN → PSC Frontend → Databricks | No public exposure |
| Clusters → Control Plane | Clusters → PSC Backend → Databricks | Private relay |
| Clusters → DBFS | Clusters → VPC → GCS | Private storage access |
Customer-Managed Keys encrypt all data:
| Data Type | Encryption | Key Location |
|---|---|---|
| DBFS Storage | CMEK | Your Cloud KMS |
| Notebook State | CMEK | Your Cloud KMS |
| Cluster Disks | CMEK | Your Cloud KMS |
| System Logs | CMEK | Your Cloud KMS |
Benefits:
- You hold the key: disabling it cuts off Databricks' access to your data
- All key usage is auditable in Cloud Audit Logs
- Keys never leave your Cloud KMS
Multiple access control mechanisms:
- public_access_enabled = false: No public internet access
- private_access_level = "ACCOUNT": Only registered VPC endpoints
- IP access lists: configured in workspace.auto.tfvars

Private DNS Zone prevents DNS leakage:
```mermaid
graph LR
A[User Access Attempt] --> B{On Corporate VPN?}
B -->|No| C[Access Denied - No Route]
B -->|Yes| D{IP in Allow List?}
D -->|No| E[Access Denied - IP Block]
D -->|Yes| F{DNS Resolves?}
F -->|No| G[Access Denied - No DNS]
F -->|Yes| H{PSC Connected?}
H -->|No| I[Access Denied - Network]
H -->|Yes| J{Databricks Auth?}
J -->|No| K[Access Denied - Auth]
J -->|Yes| L[Access Granted]
L --> M[Data Encrypted with CMEK]
style L fill:#34A853
style M fill:#FBBC04
style C fill:#EA4335
style E fill:#EA4335
style G fill:#EA4335
style I fill:#EA4335
style K fill:#EA4335
```
Option 1: Service Account Impersonation (recommended)

```bash
# Set the service account to impersonate
gcloud config set auth/impersonate_service_account automation-sa@project.iam.gserviceaccount.com
# Generate access token
export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
```

Option 2: Service Account Key

```bash
# Download service account key
gcloud iam service-accounts keys create ~/sa-key.json \
--iam-account=automation-sa@project.iam.gserviceaccount.com
# Set environment variable
export GOOGLE_APPLICATION_CREDENTIALS=~/sa-key.json
```
Security Best Practice: Use Option 1 (impersonation) to avoid managing key files.
For detailed authentication guide, see ../sa-impersonation.md.
This deployment uses five Terraform providers:
provider "google" {
project = var.google_project_name
region = var.google_region
}
```
provider "google" {
alias = "vpc_project"
project = var.google_shared_vpc_project
region = var.google_region
}
```
Required for PSC endpoint creation:
provider "google-beta" {
project = var.google_shared_vpc_project
region = var.google_region
}
```
provider "databricks" {
alias = "accounts"
host = "https://accounts.gcp.databricks.com"
google_service_account = var.google_service_account_email
}
```
Used for:
- Registering the CMEK with the account
- Registering VPC endpoints and private access settings
- Creating the network configuration and the workspace
provider "databricks" {
alias = "workspace"
host = databricks_mws_workspaces.databricks_workspace.workspace_url
google_service_account = var.google_service_account_email
}
```
Used for:
- Enabling IP access lists and configuring allowed IPs
- Creating the admin user
Required input variables:
- google_vpc_id: VPC network name
- google_shared_vpc_project: Host/Shared VPC project ID
- node_subnet: /24 (251 IPs)
- google_pe_subnet: /28 (11 IPs)
- cmek_resource_id: projects/{project}/locations/{region}/keyRings/{keyring}/cryptoKeys/{key}

Required egress rules from node subnet:
```
# Allow to PSC endpoints
Source: Node subnet
Destination: PSC subnet
Protocols: TCP 443, 6666, 8443-8451
# Allow to GCP APIs (including KMS)
Source: Node subnet
Destination: 0.0.0.0/0
Protocols: TCP 443
# Allow internal cluster communication
Source: Node subnet
Destination: Node subnet
Protocols: TCP/UDP (all ports)
```
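As a sketch, the first egress rule could be expressed in Terraform like this; var.psc_subnet_cidr is a hypothetical variable holding your PSC subnet range:

```hcl
# Sketch: egress from cluster nodes to the PSC endpoints.
resource "google_compute_firewall" "node_to_psc_egress" {
  name               = "databricks-node-to-psc" # illustrative name
  project            = var.google_shared_vpc_project
  network            = var.google_vpc_id
  direction          = "EGRESS"
  destination_ranges = [var.psc_subnet_cidr] # hypothetical variable

  allow {
    protocol = "tcp"
    ports    = ["443", "6666", "8443-8451"]
  }
}
```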
resource "databricks_mws_customer_managed_keys" "this"
Registers your customer-managed key with the Databricks account so it can be attached to workspaces.
Key Attributes:
- kms_key_id: Full KMS key resource ID from variable
- use_cases: ["STORAGE", "MANAGED"]
- lifecycle.ignore_changes = all: Prevents updates after creation
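A minimal sketch of what this resource looks like; attribute names follow the Databricks provider's GCP schema, so verify against your provider version:

```hcl
resource "databricks_mws_customer_managed_keys" "this" {
  provider   = databricks.accounts
  account_id = var.databricks_account_id

  gcp_key_info {
    kms_key_id = var.cmek_resource_id # full KMS key resource ID
  }

  # One key registered for both DBFS/storage and managed services.
  use_cases = ["STORAGE", "MANAGED"]

  lifecycle {
    ignore_changes = all
  }
}
```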
resource "google_compute_forwarding_rule" "frontend_psc_ep"
Creates a reserved internal IP address and the frontend PSC endpoint (a forwarding rule) for the workspace UI and REST API.
Configuration:
- target: Frontend service attachment URI
- load_balancing_scheme: Empty (for service attachment)
- network: Your VPC
- subnetwork: PSC subnet
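A sketch of the IP reservation and forwarding rule pair, assuming the subnet variables resolve to subnets in the host project:

```hcl
resource "google_compute_address" "frontend_pe_ip_address" {
  provider     = google-beta
  name         = var.workspace_pe_ip_name
  region       = var.google_region
  subnetwork   = var.google_pe_subnet
  address_type = "INTERNAL"
}

resource "google_compute_forwarding_rule" "frontend_psc_ep" {
  provider              = google-beta
  name                  = var.workspace_pe
  region                = var.google_region
  network               = var.google_vpc_id
  subnetwork            = var.google_pe_subnet
  ip_address            = google_compute_address.frontend_pe_ip_address.id
  target                = var.workspace_service_attachment
  load_balancing_scheme = "" # must be empty for PSC service attachments
}
```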
resource "google_compute_forwarding_rule" "backend_psc_ep"
Creates the matching reserved IP and backend PSC endpoint for the secure cluster connectivity relay.
resource "databricks_mws_vpc_endpoint" "workspace_vpce"
resource "databricks_mws_vpc_endpoint" "relay_vpce"
Registers both PSC endpoints with your Databricks account as VPC endpoints.
Note: Must wait for PSC connection status = "ACCEPTED"
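A sketch of the frontend registration; the backend relay_vpce mirrors it with the backend forwarding rule. The gcp_vpc_endpoint_info field names follow the provider's GCP schema and should be verified against your provider version:

```hcl
resource "databricks_mws_vpc_endpoint" "workspace_vpce" {
  provider          = databricks.accounts
  account_id        = var.databricks_account_id
  vpc_endpoint_name = "frontend-vpce" # illustrative name

  gcp_vpc_endpoint_info {
    project_id        = var.google_shared_vpc_project
    psc_endpoint_name = google_compute_forwarding_rule.frontend_psc_ep.name
    endpoint_region   = var.google_region
  }
}
```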
resource "databricks_mws_private_access_settings" "pas"
Key Configuration:
```hcl
public_access_enabled = false     # No public access
private_access_level = "ACCOUNT" # Any account VPC endpoints
Important:
- public_access_enabled cannot be changed after creation
- Set false for a fully private workspace
- Set true if you want optional public access with IP lists

Access Level Options:
| Level | Meaning |
|---|---|
| `ACCOUNT` | Any VPC endpoints registered in your account can access |
| `ENDPOINT` | Only explicitly specified VPC endpoints can access |
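Sketched in full, the private access settings resource looks roughly like this; the name is illustrative and the argument set can vary slightly across provider versions:

```hcl
resource "databricks_mws_private_access_settings" "pas" {
  provider                     = databricks.accounts
  private_access_settings_name = "pas-${var.databricks_workspace_name}" # illustrative
  region                       = var.google_region
  public_access_enabled        = false     # locked in at creation
  private_access_level         = "ACCOUNT" # any registered account endpoint
}
```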
resource "databricks_mws_networks" "databricks_network"
Associates your VPC, node subnet, and registered VPC endpoints into a Databricks network configuration.
Key Attributes:
```hcl
vpc_endpoints {
dataplane_relay = [databricks_mws_vpc_endpoint.relay_vpce.vpc_endpoint_id]
rest_api = [databricks_mws_vpc_endpoint.workspace_vpce.vpc_endpoint_id]
}
```
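In context, the full network registration looks roughly like this; gcp_network_info field names follow the provider's GCP schema, and the network_name is illustrative:

```hcl
resource "databricks_mws_networks" "databricks_network" {
  provider     = databricks.accounts
  account_id   = var.databricks_account_id
  network_name = "net-${var.databricks_workspace_name}" # illustrative

  gcp_network_info {
    network_project_id = var.google_shared_vpc_project
    vpc_id             = var.google_vpc_id
    subnet_id          = var.node_subnet
    subnet_region      = var.google_region
  }

  vpc_endpoints {
    dataplane_relay = [databricks_mws_vpc_endpoint.relay_vpce.vpc_endpoint_id]
    rest_api        = [databricks_mws_vpc_endpoint.workspace_vpce.vpc_endpoint_id]
  }
}
```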
resource "databricks_mws_workspaces" "databricks_workspace"
Creates the workspace itself, tying together the network configuration, private access settings, and CMEK.
Key Attributes:
- private_access_settings_id: PSC configuration
- network_id: VPC endpoints
- storage_customer_managed_key_id: CMEK for DBFS
- managed_services_customer_managed_key_id: CMEK for notebooks
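Putting it together, a sketch of the workspace resource; verify attribute names against your provider version:

```hcl
resource "databricks_mws_workspaces" "databricks_workspace" {
  provider       = databricks.accounts
  account_id     = var.databricks_account_id
  workspace_name = var.databricks_workspace_name
  location       = var.google_region

  cloud_resource_container {
    gcp {
      project_id = var.google_project_name
    }
  }

  network_id                 = databricks_mws_networks.databricks_network.network_id
  private_access_settings_id = databricks_mws_private_access_settings.pas.private_access_settings_id

  # The same registered key serves both encryption scopes.
  storage_customer_managed_key_id          = databricks_mws_customer_managed_keys.this.customer_managed_key_id
  managed_services_customer_managed_key_id = databricks_mws_customer_managed_keys.this.customer_managed_key_id
}
```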
resource "databricks_ip_access_list" "this"
Configures IP access lists at the workspace level: enables the feature, then registers the allowed IPs.
Note: When public_access_enabled = false, IP lists don’t affect access (already private).
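A sketch of the pair, assuming a hypothetical ip_addresses list variable holding your allowed CIDRs:

```hcl
resource "databricks_workspace_conf" "this" {
  provider = databricks.workspace
  custom_config = {
    "enableIpAccessLists" = "true"
  }
}

resource "databricks_ip_access_list" "this" {
  provider     = databricks.workspace
  depends_on   = [databricks_workspace_conf.this]
  label        = "allowed-ips" # illustrative
  list_type    = "ALLOW"
  ip_addresses = var.ip_addresses # hypothetical variable
}
```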
resource "google_dns_managed_zone" "databricks-private-zone"
Configuration:
- DNS name: gcp.databricks.com (private zone)

Four A records are created automatically via dns.tf:
- <workspace-id>.gcp.databricks.com → Frontend IP
- dp-<workspace-id>.gcp.databricks.com → Frontend IP
- <region>.psc-auth.gcp.databricks.com → Frontend IP
- tunnel.<region>.gcp.databricks.com → Backend IP
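For illustration, one of the four records might be declared like this in dns.tf; the resource name is illustrative:

```hcl
resource "google_dns_record_set" "frontend_workspace" { # illustrative name
  provider     = google.vpc_project
  managed_zone = google_dns_managed_zone.databricks-private-zone.name
  name         = "${databricks_mws_workspaces.databricks_workspace.workspace_id}.gcp.databricks.com."
  type         = "A"
  ttl          = 300
  rrdatas     = [google_compute_address.frontend_pe_ip_address.address]
}
```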
The deployment proceeds in eight phases:

```mermaid
sequenceDiagram
participant TF as Terraform
participant KMS as Cloud KMS
participant GCP as Google Cloud
participant DB_ACC as Databricks Account
participant DB_WS as Databricks Workspace
Note over TF,KMS: Phase 1: CMEK Validation
TF->>KMS: Verify KMS Key Exists
TF->>KMS: Verify Service Account Has Access
KMS-->>TF: Key Accessible
Note over TF,DB_ACC: Phase 2: CMEK Registration
TF->>DB_ACC: Register CMEK with Databricks
DB_ACC->>KMS: Test Key Access
KMS-->>DB_ACC: Access Granted
DB_ACC-->>TF: CMEK ID
Note over TF,GCP: Phase 3: PSC Endpoints
TF->>GCP: Allocate Frontend Private IP
TF->>GCP: Allocate Backend Private IP
TF->>GCP: Create Frontend PSC Endpoint
TF->>GCP: Create Backend PSC Endpoint
GCP->>GCP: Connect to Databricks Service Attachments
GCP-->>TF: PSC Status = ACCEPTED
Note over TF,DB_ACC: Phase 4: VPC Endpoint Registration
TF->>DB_ACC: Register Frontend VPC Endpoint
TF->>DB_ACC: Register Backend VPC Endpoint
DB_ACC-->>TF: VPC Endpoint IDs
Note over TF,DB_ACC: Phase 5: Private Access & Network
TF->>DB_ACC: Create Private Access Settings
TF->>DB_ACC: Create Network Configuration with VPC Endpoints
DB_ACC-->>TF: Configuration IDs
Note over TF,DB_ACC: Phase 6: Encrypted Workspace
TF->>DB_ACC: Create Workspace with PSC + CMEK
DB_ACC->>GCP: Deploy GKE Cluster (Encrypted)
DB_ACC->>GCP: Create GCS Bucket (Encrypted)
DB_ACC->>KMS: Encrypt with Customer Key
GCP-->>DB_ACC: Resources Ready (Encrypted & Private)
DB_ACC-->>TF: Workspace URL
Note over TF,GCP: Phase 7: DNS Configuration
TF->>GCP: Create Private DNS Zone
TF->>GCP: Create 4 DNS A Records
GCP-->>TF: DNS Configured
Note over TF,DB_WS: Phase 8: Workspace Configuration
TF->>DB_WS: Enable IP Access Lists
TF->>DB_WS: Configure Allowed IPs
TF->>DB_WS: Create Admin User
TF->>DB_WS: Add to Admins Group
DB_WS-->>TF: Configuration Complete
Note over DB_WS: Secure Workspace Ready<br/>(Private + Encrypted)
```
Edit providers.auto.tfvars:
```hcl
# Service Account for authentication
google_service_account_email = "automation-sa@my-service-project.iam.gserviceaccount.com"
# Service/Consumer Project
google_project_name = "my-service-project"
# Host/Shared VPC Project
google_shared_vpc_project = "my-host-project"
# Region (must match KMS key location)
google_region = "us-central1"
Edit workspace.auto.tfvars:
```hcl
# Databricks Configuration
databricks_account_id = "12345678-1234-1234-1234-123456789abc"
databricks_account_console_url = "https://accounts.gcp.databricks.com"
databricks_workspace_name = "my-secure-workspace"
databricks_admin_user = "admin@mycompany.com"
# Network Configuration
google_vpc_id = "my-vpc-network"
node_subnet = "databricks-node-subnet"
google_pe_subnet = "databricks-psc-subnet"
# PSC Endpoint Names (must be unique)
workspace_pe = "us-c1-frontend-ep"
relay_pe = "us-c1-backend-ep"
workspace_pe_ip_name = "frontend-pe-ip"
relay_pe_ip_name = "backend-pe-ip"
# PSC Service Attachments (region-specific)
# Find yours at: https://docs.gcp.databricks.com/resources/supported-regions.html#psc
workspace_service_attachment = "projects/prod-gcp-us-central1/regions/us-central1/serviceAttachments/plproxy-psc-endpoint-all-ports"
relay_service_attachment = "projects/prod-gcp-us-central1/regions/us-central1/serviceAttachments/ngrok-psc-endpoint"
# CMEK Configuration (pre-created key)
cmek_resource_id = "projects/my-project/locations/us-central1/keyRings/databricks-keyring/cryptoKeys/databricks-key"
```
Before deployment:
```bash
# Option 1: Service Account Impersonation (Recommended)
gcloud config set auth/impersonate_service_account automation-sa@project.iam.gserviceaccount.com
export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
# Option 2: Service Account Key
export GOOGLE_APPLICATION_CREDENTIALS=~/sa-key.json
```
```bash
cd gcp/gh-repo/gcp/terraform-scripts/byovpc-psc-cmek-ws
terraform init
terraform plan
```
Expected Resources (~25-30 resources):
- CMEK registration
- Two internal IP addresses and two PSC forwarding rules
- Two Databricks VPC endpoint registrations
- Private access settings and network configuration
- The workspace itself
- Private DNS zone with four A records
- Workspace configuration and IP access list
```bash
terraform apply
```
Deployment Time: ~18-25 minutes
Progress: Terraform registers the CMEK, brings up the PSC endpoints, creates the workspace (the longest step), then finishes with DNS and workspace configuration.
```bash
terraform output
```
Expected Outputs:
```
front_end_psc_status = "Frontend psc status: ACCEPTED"
backend_end_psc_status = "Backend psc status: ACCEPTED"
workspace_url = "https://12345678901234.1.gcp.databricks.com"
```
Important: Both PSC statuses must be "ACCEPTED".
Check KMS key is being used:
```bash
# View key usage logs (requires Cloud Audit Logs)
gcloud logging read "resource.type=cloudkms_cryptokey AND \
resource.labels.key_ring_id=databricks-keyring AND \
resource.labels.crypto_key_id=databricks-key" \
--limit=10 \
--project=my-project
```
Verify DNS resolution from inside the VPC:

```bash
nslookup <workspace-id>.gcp.databricks.com
# Should resolve to frontend private IP
```
| Output | Description |
|---|---|
| `workspace_url` | Workspace URL (resolves to private IP) |
| `front_end_psc_status` | Frontend PSC connection status (should be ACCEPTED) |
| `backend_end_psc_status` | Backend PSC connection status (should be ACCEPTED) |
| `service_account` | GCE service account attached to cluster nodes |
| `ingress_firewall_enabled` | IP access list enabled status |
| `ingress_firewall_ip_allowed` | Allowed IP addresses |
Error:

```
Error: cannot register customer-managed key: key not accessible
```
Solution: verify the key exists, inspect its IAM policy, and grant the service account access if it is missing:
```bash
# 1. Verify the key exists
gcloud kms keys describe databricks-key \
  --keyring=databricks-keyring \
  --location=us-central1 \
  --project=my-project

# 2. Inspect the key's IAM policy
gcloud kms keys get-iam-policy databricks-key \
  --keyring=databricks-keyring \
  --location=us-central1 \
  --project=my-project

# 3. Grant the service account access
gcloud kms keys add-iam-policy-binding databricks-key \
  --keyring=databricks-keyring \
  --location=us-central1 \
  --member="serviceAccount:automation-sa@my-project.iam.gserviceaccount.com" \
  --role="roles/cloudkms.cryptoKeyEncrypterDecrypter" \
  --project=my-project
```
Error:

```
Frontend psc status: PENDING
Backend psc status: PENDING
```
Solution: PSC connections are normally accepted automatically; check that each forwarding rule targets the correct region-specific service attachment:
```bash
gcloud compute forwarding-rules describe frontend-ep \
--region=us-central1 \
--project=my-host-project
```
Error: Workspace URL doesn’t resolve from corporate network
Solution:
```bash
# Confirm the private zone and its records exist
gcloud dns managed-zones list --project=my-host-project
gcloud dns record-sets list --zone=databricks --project=my-host-project

# From a VM in your VPC
nslookup <workspace-id>.gcp.databricks.com
```
- Configure your corporate DNS to forward *.gcp.databricks.com to Cloud DNS

As a temporary workaround, add a hosts-file entry:

```
# /etc/hosts or C:\Windows\System32\drivers\etc\hosts
10.1.0.5 1234567890123456.1.gcp.databricks.com
```
Error:

```
Error: workspace creation failed: unable to encrypt with customer-managed key
```
Solution:
This means Databricks cannot use the KMS key for encryption:
```bash
# 1. Check the registered CMEK in Terraform state
terraform state show databricks_mws_customer_managed_keys.this

# 2. Test that the key can encrypt
echo "test data" | gcloud kms encrypt \
  --key=databricks-key \
  --keyring=databricks-keyring \
  --location=us-central1 \
  --plaintext-file=- \
  --ciphertext-file=- \
  --project=my-project | base64

# 3. Re-check the key's IAM policy
gcloud kms keys get-iam-policy databricks-key \
  --keyring=databricks-keyring \
  --location=us-central1 \
  --project=my-project

# 4. Re-apply the CMEK registration if needed
terraform apply -target=databricks_mws_customer_managed_keys.this
```
Error: Connection timeout when accessing workspace URL
Causes:
- Not connected to the corporate VPN
- DNS resolving to a public IP instead of the PSC endpoint
- Firewall rules blocking the path to the PSC subnet
Solution:
```bash
# Test connectivity to PSC subnet
ping <frontend-private-ip>

nslookup <workspace-id>.gcp.databricks.com
# Should return frontend private IP, not public IP
```
If public access is enabled (public_access_enabled = true), confirm your egress IP is allowed:

```bash
curl ifconfig.me
# Check this IP is in workspace.auto.tfvars ip_addresses list
```
```bash
telnet <frontend-private-ip> 443

# Verify your source IP can reach PSC subnet
gcloud compute firewall-rules list --project=my-host-project
```
Error: Clusters fail to start with "Unable to connect to control plane"
Solution:
```bash
# 1. Confirm the backend PSC connection
terraform output backend_end_psc_status
# Must be: "Backend psc status: ACCEPTED"

# 2. Check relay DNS
nslookup tunnel.us-central1.gcp.databricks.com
# Should resolve to backend private IP

# 3. Test the relay port and inspect firewall rules
telnet <backend-private-ip> 6666
gcloud compute firewall-rules list \
  --filter="allowed.ports:6666" \
  --project=my-host-project
```
Critical Error: Workspace becomes completely inaccessible after disabling KMS key
Solution:
IMMEDIATE ACTION REQUIRED:
```bash
# List key versions
gcloud kms keys versions list \
--key=databricks-key \
--keyring=databricks-keyring \
--location=us-central1 \
--project=my-project
# Re-enable the primary version
gcloud kms keys versions enable <VERSION> \
--key=databricks-key \
--keyring=databricks-keyring \
--location=us-central1 \
--project=my-project
```
Wait 10-15 minutes for Databricks to detect the re-enabled key.
Prevention:
- Never disable or schedule destruction of the key while the workspace exists
- Restrict who holds KMS admin permissions on the key
- Alert on key state changes via Cloud Audit Logs
```bash
# Check PSC endpoint status
gcloud compute forwarding-rules describe frontend-ep \
--region=us-central1 --project=my-host-project
gcloud compute forwarding-rules describe backend-ep \
--region=us-central1 --project=my-host-project
# Verify DNS records
gcloud dns record-sets list --zone=databricks --project=my-host-project
# Test DNS resolution
nslookup <workspace-id>.gcp.databricks.com
nslookup tunnel.us-central1.gcp.databricks.com
# Check KMS key
gcloud kms keys describe databricks-key \
--keyring=databricks-keyring \
--location=us-central1 \
--project=my-project
# View KMS key IAM policy
gcloud kms keys get-iam-policy databricks-key \
--keyring=databricks-keyring \
--location=us-central1 \
--project=my-project
# Check Terraform state
terraform state show databricks_mws_customer_managed_keys.this
terraform state show google_compute_forwarding_rule.frontend_psc_ep
terraform state show google_compute_forwarding_rule.backend_psc_ep
# View all outputs
terraform output -json | jq
```
To destroy all resources:
```bash
terraform destroy
```
Destruction Order: roughly the reverse of creation: the workspace first, then the network configuration, private access settings, VPC endpoint registrations, PSC forwarding rules and IPs, and finally the DNS records and zone.
Important Notes:
- The KMS key is NOT destroyed; it was created outside this configuration
- The pre-existing VPC and subnets are left in place
Time: ~15-20 minutes
After deploying your secure workspace:
- Set up Unity Catalog (see ../uc/)
- For a complete workspace including Unity Catalog, see ../end2end/
- Keep the workspace fully private (public_access_enabled = false)

This configuration is provided as a reference implementation for deploying the most secure Databricks workspaces on GCP with Private Service Connect and Customer-Managed Encryption Keys.