Production-Ready · Open Source

All Things Databricks

Production-ready infrastructure templates, AI governance patterns, and security architectures for Databricks on Azure, AWS, and GCP — built by Bhavin Kukadia.

200+ Terraform Files
127+ Docs Pages
3 Cloud Providers
13+ Blog Articles

Cloud Deployments

Infrastructure as Code

Azure Databricks

Modular Terraform templates for Non-Private Link (Non-PL), Full Private (air-gapped), and Hub-Spoke patterns. Eight reusable modules cover networking, workspace, Unity Catalog, Key Vault, private endpoints, and monitoring.

AWS Databricks

PrivateLink workspace templates with full Data Exfiltration Protection (DEP) controls. Covers VPC design, PrivateLink endpoints, IAM roles, cross-account setups, and S3 data access patterns.

GCP Databricks

VPC Service Controls (VPC-SC), Private Service Connect (PSC), and customer-managed encryption key (CMEK) implementations. Includes Workload Identity Federation, GCS connectors, and data exfiltration prevention patterns.

AI Governance

Auth & Authorization

Orchestration Architecture

End-to-end governed orchestration hub covering all Databricks AI services in one authoritative reference.

Authentication Patterns

Service Principal passthrough, On-Behalf-Of-User (OBO), and OAuth Token Federation — with full code examples.

Authorization with Unity Catalog

Four-layer access control: workspace restrictions, Unity Catalog (UC) privileges, attribute-based access control (ABAC) with governed tags, and row/column-level security.

Genie Space Deep Dive

Multi-team access patterns for 1,000+ users, plus guidance on scaling Genie under complex Unity Catalog governance models.

Audit Logging & Monitoring

System tables monitoring and audit queries for tracking AI product usage, access patterns, and governance compliance.

Cross-Cloud Guides

Start Here

Authentication Guide

Set up Terraform authentication for Azure, AWS, or GCP. Zero jargon — step-by-step from scratch.

Networking Guide

Complete multi-cloud networking reference covering VNet, VPC, VPC-SC, Private Link, and troubleshooting flows.

Identities Guide

How Databricks accesses your cloud account: managed identities, service accounts, and IAM roles, explained clearly.

Common Questions & Answers

Quick answers to the most frequently asked questions about Databricks infrastructure and security setup.

Healthcare Connectors

Spark Declarative Pipelines

Architecture & Design

967-line architecture document covering streaming patterns, resilience strategies, data quality, and medallion lakehouse design for healthcare ingestion.

HL7v2 Pipeline

Full bronze-to-silver-to-gold pipeline for HL7v2 messages. Eleven curated tables cover ADT, ORM, ORU, and other message types.
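
As a rough illustration of how messages might be split by type before landing in curated tables, the sketch below reads the message code from the MSH-9 field and picks a destination. The function names, the `ROUTES` map, and the table names are hypothetical and are not taken from the repository's pipeline; unknown or malformed messages fall through to a quarantine destination.

```python
from typing import Optional

# Hypothetical routing map: message code -> destination table (illustrative names only).
ROUTES = {"ADT": "bronze_adt", "ORM": "bronze_orm", "ORU": "bronze_oru"}


def hl7v2_message_type(raw: str) -> Optional[str]:
    """Return the message code from MSH-9 (for example 'ADT'), or None if it is missing."""
    # HL7v2 segments are separated by carriage returns; tolerate newlines as well.
    segments = [s for s in raw.replace("\n", "\r").split("\r") if s.strip()]
    if not segments or not segments[0].startswith("MSH") or len(segments[0]) < 4:
        return None
    msh = segments[0]
    field_sep = msh[3]  # MSH-1: the character right after "MSH"
    fields = msh.split(field_sep)
    # MSH-1 is the separator itself, so MSH-9 sits at index 8 after splitting.
    if len(fields) <= 8 or not fields[8]:
        return None
    component_sep = fields[1][:1] or "^"  # first encoding character in MSH-2
    return fields[8].split(component_sep)[0] or None


def route_message(raw: str) -> str:
    """Pick a destination for a raw message, defaulting to quarantine."""
    return ROUTES.get(hl7v2_message_type(raw) or "", "quarantine")
```

In a streaming pipeline the same idea would typically be expressed as a column derived from the raw payload and a filter per target table; the plain-Python version here only shows the field arithmetic.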

FHIR R4 Pipeline

REST API ingestion and streaming pipeline for FHIR R4. Eight curated tables and a 1,579-line parser with validation, quarantine patterns, and schema evolution support.

Utilities

Helper Scripts

Databricks IP Ranges

Official API-backed tool to extract Databricks IP ranges by region and service type. Supports JSON, CSV, and plain CIDR output formats. Includes runbook and tests.
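
To make the three output formats concrete, here is a minimal sketch of rendering a list of ranges as JSON, CSV, or plain CIDR. The record shape (`region`, `service`, `cidr`) and the `render_ranges` function are assumptions made for illustration; the actual tool's schema, flags, and data source may differ.

```python
import csv
import io
import ipaddress
import json

FIELDS = ["region", "service", "cidr"]  # assumed record shape, not the tool's schema


def render_ranges(records, fmt="json"):
    """Render IP range records as 'json', 'csv', or plain 'cidr' text."""
    rows = []
    for rec in records:
        # ip_network raises ValueError on malformed CIDRs or set host bits.
        network = ipaddress.ip_network(rec["cidr"])
        rows.append({"region": rec["region"], "service": rec["service"], "cidr": str(network)})

    if fmt == "json":
        return json.dumps(rows, indent=2)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return buffer.getvalue()
    if fmt == "cidr":
        # One CIDR per line, convenient for firewall allowlists.
        return "\n".join(row["cidr"] for row in rows)
    raise ValueError(f"unsupported format: {fmt}")
```

The plain CIDR form drops region and service on purpose, since that output is usually piped straight into a firewall or route-table rule.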

Archive

Legacy content including Databricks jump-start notebooks, Spark/MLflow/Delta Lake examples, and REST API Postman collections.

Published Articles

Databricks Blog
A Unified Approach to Data Exfiltration Protection on Databricks
BigQuery Adds First-Party Support for Delta Lake
How Delta Sharing Enables Secure End-to-End Collaboration
Data Exfiltration Protection with Azure Databricks
View all 13+ articles on Databricks Blog