// AI Governance on Databricks

Deck 01

AI Governance
on Databricks

The Complete Picture

Identity, Authorization, and Tool Governance for Enterprise AI

The Challenge

Every enterprise wants the same thing

Business users

Ask plain-English questions, get governed answers. Search docs. Build dashboards.

AI agents

Multiple agents collaborate on complex tasks. Partners and customers need the same capabilities through their own portals.

Same data. Same AI. Same governance.
Two identity worlds: the human's identity, and the application's identity.

Authentication

One Authorization Layer

Databricks has one authorization layer — Unity Catalog. Authentication can come from multiple sources: the workspace IdP for internal users, or external IdPs via token federation for partners and customers. All paths converge on UC, which enforces governance regardless of how the token was obtained.

Azure: Entra ID
AWS: BYO IdP (Okta, Ping, any SAML/OIDC)
GCP: Cloud Identity or BYO IdP

Multiple authentication paths (workspace IdP, federation, M2M credentials) — but one authorization engine. Unity Catalog is the single trust boundary.

Token Architecture

Three Token Paths
Not Three Auth Systems

U2M, OBO, and M2M are token acquisition paths, not different authentication systems.

Path   Who Authenticates       Identity in UC   Token Acquired By
U2M    The human (directly)    The human        Human's client
OBO    The human (via app)     The human        App, forwarding token
M2M    The service principal   The SP           The application itself
[Diagram: a single IdP feeding all three token paths: U2M, OBO, M2M]

ELI5

The Restaurant Analogy

One restaurant (Databricks). One ID-check at the door (your company IdP). Same bouncer for everyone.

U2M

You walk in yourself, show your badge at the door, sit down and order. The kitchen checks your allergy list, serves your food.

You're the one at the table.

OBO

You're in a meeting, so you send your assistant with your badge. Same door, same bouncer. Kitchen checks your allergy list.

Assistant carries the tray. Never shows their own badge.

M2M

Your company has a catering account. The catering bot shows the company badge. Standard menu, same meal for everyone.

Doesn't matter who placed the order.

The door (IdP) is always the same. The kitchen (UC) always checks the badge that was presented. The only difference is whose badge gets shown.

Decision Framework

Who should UC see as the identity?

THE ACTUAL HUMAN — Does the human call Databricks directly?
  Yes → U2M
  No, through an app → OBO
THE APPLICATION (SP) — Shared data access, background jobs, RAG pipelines → M2M
Start with the identity question. Everything else follows from the answer.
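The decision tree above can be sketched as a small helper. The function name and its string return values are illustrative only, not part of any Databricks API:

```python
# Sketch of the decision framework: start with who UC should see as the
# identity, and the token path follows.

def choose_token_path(human_is_caller: bool, human_calls_directly: bool = False) -> str:
    """Return the token acquisition path implied by the identity question."""
    if human_is_caller:
        # UC should see the actual human.
        return "U2M" if human_calls_directly else "OBO"
    # UC should see the application's service principal:
    # shared data access, background jobs, RAG pipelines.
    return "M2M"
```

For example, a user-facing app that forwards the human's token maps to `choose_token_path(True, False)`, i.e. OBO.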

Resource Model

The Resource Lens

Authorization differs by what you're accessing, not where the app runs.

1. Serving Endpoints (Agent Bricks)
2. Genie Spaces
3. UC Functions
4. Vector Search
5. UC HTTP Connections
6. Tables (Row Filters, Column Masks)
7. Lakebase (PG-native exception)

Authorization Matrix

Resource Auth Matrix

Resource              Recommended Path       Identity in UC       AuthZ Model
Serving Endpoints     OBO or M2M             User or SP           UC + OAuth scopes
Genie                 OBO                    Calling user         UC + genie scopes
UC Functions          OBO or M2M             User or SP           UC EXECUTE
Vector Search         M2M                    App SP               UC SELECT
UC HTTP Connections   M2M + per-user OAuth   SP + external user   USE CONNECTION
Tables                Any                    Depends on path      Row filters, column masks
Lakebase              M2M                    App SP (PG role)     PG-native (GRANT, RLS), NOT UC

Defense in Depth

The Six Enforcement Layers

6. Audit & Observability: system.access.audit + MLflow traces
5. Execution Boundary: Model Serving, Apps, serverless
4. Outbound Control: UC Connections, network policies
3. Data Governance: Row Filters, Column Masks, ABAC
2. Permission Model: UC privileges, least-privilege SP grants
1. Identity: Agent SP, User OBO, token federation
Never accept a design that relies on a single layer.

Common Pitfall

current_user() vs is_member()

The #1 source of auth bugs in AI apps.

Path                             current_user() Returns   is_member() Evaluates
U2M                              Human email              Human's groups
OBO (direct SQL)                 Human email              Human's groups
OBO (via Genie / Agent Bricks)   Human email              Execution service identity
M2M                              SP UUID                  SP's groups
Universal rule: Use current_user() for row filters. It works correctly in every path.
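A minimal sketch of the rule in practice: a row filter keyed on current_user(). The DDL is shown as Python strings so the shape is easy to inspect; the table and column names (sales.orders, owner_email) are hypothetical:

```python
# Row-filter DDL keyed on current_user(), which resolves to the human under
# U2M/OBO and to the SP under M2M — unlike is_member(), which can evaluate
# against the execution service identity in some OBO paths.

ROW_FILTER_DDL = """
CREATE OR REPLACE FUNCTION sales.owner_filter(owner_email STRING)
RETURN owner_email = current_user();
"""

APPLY_FILTER_DDL = """
ALTER TABLE sales.orders
SET ROW FILTER sales.owner_filter ON (owner_email);
"""
```

The filter function returns a boolean per row; UC evaluates it on every access, regardless of which token path produced the session.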

External Identity

The Two-Sided Identity Problem

U2M / OBO / M2M answer "who does Databricks see?" External connections add: "who does the external service see?"

Connection Auth Method   Databricks Sees   External Service Sees
Bearer Token             Caller            Shared identity
OAuth M2M                Caller            Shared identity
OAuth U2M Shared         Caller            Shared identity
OAuth U2M Per User       Caller            Per-user ✓
Managed OAuth            Caller            Per-user ✓

Federation

Token Federation

Any app with a trusted IdP JWT can exchange it for a Databricks token via federation. No secrets.

Account Token Federation

Users & SPs. Requires SCIM sync. 5 issuer limit per account.

Workload Identity Federation

CI/CD pipelines. Per-SP binding. Unlimited issuers. Completely secretless.
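Under the hood, federation is an OAuth token exchange (RFC 8693): the app presents a trusted IdP JWT and receives a Databricks token, with no stored secret. A hedged sketch of the request body, assuming the standard token-exchange parameters; the JWT value is a placeholder:

```python
# Build the form-encoded body of an RFC 8693 token-exchange request, the
# mechanism behind token federation. Only the request body is shown; the
# token endpoint URL and the IdP JWT come from your environment.
from urllib.parse import urlencode

def build_token_exchange_body(idp_jwt: str) -> str:
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": idp_jwt,                                # the trusted IdP JWT
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "scope": "all-apis",                                     # narrow this in practice
    })
```

The response is a short-lived Databricks access token; nothing long-lived is ever written to disk or an env var.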

Scopes

OAuth Scopes

Operation                Required Scope
Genie                    dashboards.genie + genie (both required)
Agent Bricks / Serving   model-serving
SQL                      sql
UC / External MCP        unity-catalog
Vector Search            vector-search
Refresh tokens           offline_access
Scopes enforce least privilege at the token level. Request only what you need.
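The scope table lends itself to a lookup: declare which operations an app performs and request only the union of their scopes. The helper and operation keys are illustrative, not a Databricks API:

```python
# Least-privilege scope selection: map declared operations to the scopes
# in the table above and request nothing else.

SCOPE_FOR_OPERATION = {
    "genie": ["dashboards.genie", "genie"],   # both required
    "serving": ["model-serving"],
    "sql": ["sql"],
    "unity_catalog": ["unity-catalog"],
    "vector_search": ["vector-search"],
    "refresh": ["offline_access"],
}

def minimal_scopes(operations):
    """Return the sorted union of scopes the given operations need."""
    scopes = set()
    for op in operations:
        scopes.update(SCOPE_FOR_OPERATION[op])
    return sorted(scopes)
```

An agent that only queries Genie and runs SQL would request `minimal_scopes(["genie", "sql"])` and nothing more.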

Exception

Lakebase — The Exception

Lakebase uses PG-native authorization (roles, GRANT, RLS), not UC.

AuthN

Databricks OAuth (token as PG password) or native PG roles

AuthZ

PG GRANT + RLS policies. Not Unity Catalog.

Roles

Instance owner (LOGIN, CREATEDB, CREATEROLE). App SP → auto-created PG role. System roles for sync/monitoring.

UC Registration + Lakehouse Sync

Register → read-only UC catalog for cross-source queries. Lakehouse Sync → continuous CDC to Delta (SCD2 history via wal2delta).

PG-native authZ for direct connections. But UC Registration + Lakehouse Sync bridge Lakebase data into UC governance.
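Because direct connections are plain Postgres protocol, a client authenticates with its PG role and a short-lived Databricks OAuth token in the password slot. A sketch of the DSN construction; the host, database, and role names are placeholders, and the token would come from your OAuth flow:

```python
# Build a Postgres DSN for a direct Lakebase connection: PG role as the user,
# OAuth token as the password. Authorization on this path is PG-native
# (GRANT / RLS), not Unity Catalog.
from urllib.parse import quote

def lakebase_dsn(host: str, database: str, pg_role: str, oauth_token: str) -> str:
    # Tokens can contain characters that need percent-encoding in a URL.
    return (
        f"postgresql://{quote(pg_role)}:{quote(oauth_token, safe='')}"
        f"@{host}:5432/{database}?sslmode=require"
    )
```

Any Postgres driver (psycopg, JDBC) accepts the resulting DSN; because the token is short-lived, connection pools need a refresh hook rather than a cached password.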

Agent Pattern

Agent Architecture Pattern

How agents are governed — mapped to the six enforcement layers.

  1. Identity (Layer 1) — Agent gets a dedicated Service Principal
  2. Permissions (Layer 2) — SP has explicit UC grants — zero default access
  3. Data Governance (Layer 3) — Row filters fire on every data access
  4. Outbound Control (Layer 4) — External calls via UC Connections with USE CONNECTION
  5. Execution Boundary (Layer 5) — Runs in Model Serving / Apps sandbox
  6. Audit (Layer 6) — All traced via MLflow + system.access.audit

Checklist

OBO Prerequisites

Checklist

M2M Prerequisites

Security

Security Checklist

  • No secrets in env vars (UC connections hold credentials)
  • OAuth scopes enforce least privilege
  • UC governance at SQL engine level
  • Tool-level RBAC (defense in depth)
  • Rate limiting (Genie 5 QPM, connection pools)
  • Async audit (never blocks request path)
  • Structured JSON logging with request_id
  • Input validation + HTTPS-only
  • Retry with jitter (429 / 503)
  • Tokens never logged
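One checklist item made concrete: retry with full jitter on 429/503 responses. The exception type and backoff constants are illustrative; `call` stands in for any HTTP call that raises on those statuses:

```python
# Retry with capped exponential backoff and full jitter, the standard
# pattern for 429 (rate limit) and 503 (overload) responses.
import random
import time

class RetryableError(Exception):
    """Raised for HTTP 429 / 503 responses (illustrative)."""

def retry_with_jitter(call, max_attempts=5, base=0.5, cap=8.0):
    for attempt in range(max_attempts):
        try:
            return call()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # Full jitter: sleep a random amount up to the capped exponential.
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Full jitter (random in [0, backoff]) spreads retries from many clients apart, which matters for shared limits like Genie's 5 QPM.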

Differentiation

What's Different About This Approach

Identity, not secrets

No PATs, no static credentials. Federation + OAuth everywhere.

Authorization at the resource

UC grants per resource type, not blanket access.

Audit is automatic

system.access.audit captures every API call. MLflow traces every agent step.

Implementation

Getting Started

  1. Configure account-level IdP (Entra ID / BYO SAML/OIDC)
  2. Enable SCIM sync for users and groups
  3. Create Service Principals per capability boundary
  4. Set UC grants with least privilege
  5. Configure UC Connections for external services
  6. Deploy with OBO for user-facing, M2M for background
  7. Monitor via system.access.audit + MLflow

Summary

One IdP.
Three Token Paths.
Six Enforcement Layers.

Same governance model — whether the user is internal or external, whether the app runs on Databricks or outside.
