Identity & Authorization by Resource Type


How auth works at each Databricks AI component
The Resource Lens

Authorization differs by what you're accessing

The same token — U2M, OBO, or M2M — hits different auth enforcement depending on the resource. Seven resources, seven auth stories.

  • Serving Endpoints: UC + scopes
  • 💬 Genie: UC + genie scopes
  • 𝑓 UC Functions: UC EXECUTE
  • 🔍 Vector Search: UC SELECT
  • 🔗 UC HTTP Connections: USE CONNECTION
  • 🗃 Tables: row filters & masks
  • 🐘 Lakebase: PG-native (NOT UC)
Overview

Resource Auth Summary

Resource | Recommended Path | Identity in UC | How Token Arrives | AuthZ Model
Serving Endpoints | OBO or M2M | User or SP | Authorization: Bearer | UC + scopes
Genie (via App) | OBO | Calling user | X-Forwarded-Access-Token → Genie API | UC + genie scopes
Genie (direct) | U2M | Calling user | User's own token | UC + genie scopes
UC Functions | OBO or M2M | User or SP | Via SQL execution context | UC EXECUTE
Vector Search | M2M | App SP | WorkspaceClient() no-args | UC SELECT
UC HTTP Connections | M2M + per-user OAuth | SP + ext user | USE CONNECTION + auth method | UC + external IdP
Tables | Any | Depends on path | Via SQL | Row filters, column masks
Lakebase | M2M | App SP (PG role) | OAuth token as PG password | PG-native (NOT UC)
Serving Endpoints

Agent Bricks and Model Serving

Identity Model

  • OBO via ModelServingUserCredentials()
  • M2M via WorkspaceClient()
  • Token arrives as Authorization: Bearer header
  • current_user() = the token's identity
  • Scope required: model-serving

Key Pattern

The Agent Bricks supervisor auto-forwards the OBO token to sub-agents (max 20 in a chain).

User → Supervisor → Sub-Agent 1 → … → Sub-Agent N

The same OBO token propagates through the entire chain. current_user() = the human at every hop.

Serving Endpoints

Serving Endpoints — Gotchas

Single App → Serving Endpoint

Works fine. Read X-Forwarded-Access-Token from proxy header, pass as Authorization: Bearer to endpoint.

current_user() inside the endpoint = the human.

Wrong API: ModelServingUserCredentials() only works inside Model Serving. Don't use it in Apps code — use the header instead.
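A minimal sketch of the single-app pattern (the header name is the documented proxy header; the endpoint URL and requests call in the usage comment are illustrative):

```python
def bearer_headers_from_proxy(inbound_headers: dict) -> dict:
    """Turn the App proxy's forwarded user token into an outgoing
    Authorization header for the serving endpoint."""
    token = inbound_headers.get("X-Forwarded-Access-Token")
    if not token:
        raise PermissionError("no forwarded user token; is the app behind the proxy?")
    return {"Authorization": f"Bearer {token}"}


# Illustrative usage inside a request handler:
# requests.post(
#     "https://<workspace>/serving-endpoints/<endpoint>/invocations",
#     headers=bearer_headers_from_proxy(request.headers),
#     json={"messages": [...]},
# )
```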

App A → App B (Two-Proxy)

User token is lost. App B's proxy strips App A's forwarded user token and replaces it with App A's SP identity.

Workaround: X-Forwarded-Email survives proxy hops (set by proxy from validated token, cannot be forged). Use email + M2M SQL.
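The workaround can be sketched as below (table and column names are hypothetical; the query runs under the app's own M2M identity while the WHERE clause pins it to the forwarded email):

```python
def user_scoped_query(inbound_headers: dict) -> tuple:
    """Two-proxy workaround: identify the user via X-Forwarded-Email,
    then query with the app's own (M2M) identity."""
    email = inbound_headers.get("X-Forwarded-Email")
    if not email:
        raise PermissionError("no forwarded email header")
    # Parameterized so the header value is never spliced into the SQL text
    sql = "SELECT * FROM prod.sales.deals WHERE owner_email = :email"
    return sql, {"email": email}
```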

Genie Spaces

Natural language to SQL, governed by UC

Identity Model

  • Identity: OBO — user's token forwarded to Genie API
  • Scopes: BOTH dashboards.genie AND genie required
  • current_user() = calling human
  • is_member() = execution service (NOT the human)

Rule: Use current_user() for row filters, never is_member().

Rate Limits

5 queries per minute (sliding window)

Thread Management

  • conversation_id for follow-up questions
  • New conversation for topic changes
  • Genie maintains SQL context within a thread
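Thread management reduces to one decision: reuse the conversation_id or start fresh. A sketch (the REST paths match the Genie Conversation API as currently documented, but treat them as assumptions):

```python
def genie_message_path(space_id: str, conversation_id) -> str:
    """Continue an existing Genie thread when a conversation_id is known,
    otherwise start a new conversation."""
    base = f"/api/2.0/genie/spaces/{space_id}"
    if conversation_id:
        # Follow-up question: Genie keeps SQL context within this thread
        return f"{base}/conversations/{conversation_id}/messages"
    # Topic change: start a fresh thread
    return f"{base}/start-conversation"
```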
Genie Spaces

Genie — Gotchas

The Scope Bug

UI only shows dashboards.genie when configuring app integration scopes.

But the API checks for both genie and dashboards.genie.

Missing genie scope →

403: "required scopes: genie"

Fix: Patch via API

PATCH /api/2.0/preview/accounts/{account_id}/oidc/custom-app-integration/{integration_id}

{
  "scopes": [
    "dashboards.genie",
    "genie",
    "sql"
  ]
}

Add genie scope via the account-level API since the UI won't show it.

UC Functions

Unity Catalog Functions

The safe way to expose write operations to agents

Identity Model

  • Identity: inherits from calling context (OBO or M2M)
  • Grant required: EXECUTE on the function
  • current_user() = whoever triggered the SQL

Key Pattern

Wrap INSERT/UPDATE in a UC function. Grant EXECUTE to agent SP.

Agent gets EXECUTE, not raw table writes.

Agent
UC Function
Table

Function enforces business logic + authorization internally. Agent never touches table directly.

UC Functions

UC Functions — Patterns

CREATE FUNCTION prod.tools.approve_deal(
  deal_id STRING,
  approver STRING
)
RETURNS STRING
LANGUAGE SQL
RETURN (
  UPDATE prod.sales.deals
  SET status = 'approved',
      approved_by = approver
  WHERE id = deal_id
    AND current_user() IN (
      SELECT email
      FROM prod.sales.approvers
    )
);

What This Achieves

Agent SP Has EXECUTE on function
Agent SP Does NOT have UPDATE on table
Function Enforces business logic internally
current_user() Checked inside function body

Principle: Functions are the authorization boundary for write operations. Grant EXECUTE, not table-level permissions.
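Invoking the function from an agent running under M2M can be sketched as a Statement Execution payload (request shape abbreviated; the function name matches the example above, and parameterization keeps agent input out of the SQL text):

```python
def approve_deal_call(deal_id: str, approver: str) -> dict:
    """Payload sketch for the SQL Statement Execution API; the agent SP
    needs only EXECUTE on the function, not UPDATE on the table."""
    return {
        "statement": "SELECT prod.tools.approve_deal(:deal_id, :approver)",
        "parameters": [
            {"name": "deal_id", "value": deal_id},
            {"name": "approver", "value": approver},
        ],
    }
```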

Vector Search

Semantic similarity over governed data

Identity Model

  • Identity: typically M2M (App SP has SELECT on the index)
  • WorkspaceClient() no-args picks up auto-injected SP credentials
  • Scope: vector-search

No native per-user context. Vector Search doesn't have OBO support for per-user filtering.

Per-User Filtering Strategies

  • Pre-filter: include a user_group column in the indexed data; filter in the similarity search query.
  • Post-filter: retrieve top-K, then check the user's permissions on each result.
  • Shared KB: M2M is correct (same docs for all users; no filtering needed).
Vector Search

Vector Search — Patterns

Pre-Filter Pattern

from databricks.vector_search.client import VectorSearchClient

# No-args client picks up the App SP's auto-injected credentials (M2M)
vsc = VectorSearchClient()
index = vsc.get_index(
  endpoint_name="shared-endpoint",    # illustrative endpoint name
  index_name="prod.docs.knowledge_idx",
)

# Index includes a department column; filter before scoring
results = index.similarity_search(
  query_text=user_query,
  columns=["doc_id", "chunk_text"],   # illustrative column names
  filters={"department": user_dept},
  num_results=10,
)

Filter happens inside the vector search engine. Only matching rows are scored.

Delta Sync

Delta Sync index auto-updates when source table changes. No manual re-indexing needed.

Post-Filter Pattern

Retrieve top-K results, then check each result against user's permissions. Higher latency, but works when permissions are complex or external.
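A post-filter sketch with a stubbed permission check (can_read stands in for whatever ACL lookup you actually have):

```python
def post_filter(results: list, can_read, k: int) -> list:
    """Keep only results the caller may see; stop once k survive."""
    allowed = []
    for r in results:
        if can_read(r["doc_id"]):
            allowed.append(r)
        if len(allowed) == k:
            break
    return allowed


# Example with a stubbed permission check:
hits = [{"doc_id": i, "score": 1.0 - i / 10} for i in range(10)]
visible = post_filter(hits, can_read=lambda d: d % 2 == 0, k=3)
# visible -> doc_ids 0, 2, 4
```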

Trade-off: Pre-filter is faster but requires group info in the index. Post-filter is flexible but wastes retrieval budget on filtered-out results.

UC HTTP Connections

The governed way to call external services

What It Does

Governed connectivity to external APIs: Jira, GitHub, Salesforce, Slack, and more.

  • Four auth methods: Bearer Token, OAuth M2M, OAuth U2M Shared, OAuth U2M Per User
  • Access controlled via: GRANT USE CONNECTION ON CONNECTION <name> TO <principal>
  • Tokens injected by platform, never in code

Why It Matters

Without UC connections, agents either:

  • Hardcode API keys in environment variables (unauditable)
  • Use shared service accounts (no per-user identity)
  • Require custom token management (error-prone)

UC connections give you governed, auditable, rotatable external access — with GRANT/REVOKE at runtime.
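Calls through a connection are typically issued with the SQL http_request function; the sketch below composes such a call (treat the http_request signature as an assumption and check current docs; the conn and path values are illustrative and assumed to come from trusted config, not user input):

```python
def connection_call_sql(conn: str, method: str, path: str) -> str:
    """Compose a SQL call through a UC HTTP connection.
    The platform injects the credential; no token appears in code."""
    assert method in {"GET", "POST", "PUT", "PATCH", "DELETE"}
    return (
        "SELECT http_request("
        f"conn => '{conn}', method => '{method}', path => '{path}')"
    )
```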

UC HTTP Connections

Two-Sided Identity

Who does Databricks see vs. who does the external service see?

Auth Method | Databricks Sees | External Service Sees | Use Case
Bearer Token | Caller | Shared (one token) | Simple integrations, shared API keys
OAuth M2M | Caller | Shared (app creds) | Org-level access
OAuth U2M Shared | Caller | Shared (one user's) | Admin-delegated access
OAuth U2M Per User | Caller | Per-user ✓ | User-specific (Jira as them, GitHub as them)
Managed OAuth | Caller | Per-user ✓ | Google, SharePoint — Databricks handles OAuth
Only U2M Per User and Managed OAuth give true per-user identity on BOTH sides. All others share a single external identity.

UC HTTP Connections

UC HTTP Connections — Gotchas

Watch Out

  • GRANT/REVOKE is runtime — removing USE CONNECTION immediately removes tool access
  • Bearer token is static — rotate manually
  • U2M Per User requires user to have authorized the external app (OAuth consent)
  • Managed OAuth: limited providers (Google, SharePoint)

No Read-Only Scope

There is no way to scope a connection to read-only at the connection level.

The external service must enforce read vs. write permissions. UC controls who can use the connection, not what they can do through it.

Mitigation: Create separate connections with different external credentials (read-only vs. read-write) and GRANT them to different groups.

Tables

Tables — Row Filters & Column Masks

Data-level authorization, enforced at the SQL engine

Row Filters

  • SQL function evaluated at query time, attached to table
  • Fires regardless of access path: notebook, API, Genie, agent
  • Uses current_user() or is_member() to determine access

Column Masks

  • SQL function that transforms column values per user
  • Same trigger rules as row filters
  • Common pattern: mask PII for non-privileged users

ABAC governed tags for attribute-based rules. Tag columns with sensitivity levels, then write filters that check user attributes against tags.
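A row filter built on current_user() (per the bullet above), expressed here as DDL strings; the function, table, and column names are hypothetical:

```python
# Row filter: each user sees only rows they own (names are hypothetical)
ROW_FILTER_DDL = """
CREATE OR REPLACE FUNCTION prod.sec.deal_filter(owner_email STRING)
RETURNS BOOLEAN
RETURN owner_email = current_user();
""".strip()

# Attach it; fires on every access path (notebook, API, Genie, agent)
ATTACH_DDL = """
ALTER TABLE prod.sales.deals
SET ROW FILTER prod.sec.deal_filter ON (owner_email);
""".strip()
```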

Tables

current_user() vs is_member()

The #1 gotcha in table-level authorization

Access Path | current_user() | is_member()
Notebook (U2M) | Human ✓ | Human's groups ✓
App SQL (OBO direct) | Human ✓ | Human's groups ✓
Genie (OBO) | Human ✓ | Execution service ✗
Agent Bricks (OBO) | Human ✓ | Execution service ✗
Background job (M2M) | SP UUID | SP's groups ✓

Rule: Always use current_user()

It returns the correct identity in every access path. is_member() breaks in Genie and Agent Bricks OBO because the execution context is a service, not the user.

Lakebase

Lakebase — PG-Native Authorization

The ONLY resource that does NOT use UC for authorization

The Exception

  • Uses PostgreSQL-native authorization: roles, GRANT, RLS policies
  • AuthN: Databricks OAuth (token as PG password) or native PG roles + passwords
  • PG role per SP: app's SP client ID becomes the PG role name
NOT Unity Catalog

Two AuthZ Paths

Direct PG connection (apps, psql): PG roles + GRANT + RLS govern access.

Via lakehouse (notebooks, DBSQL, federation): UC policies apply, not PG grants.

Implication: Same data, different auth model depending on how you access it.

Lakebase

Lakebase — Roles & Apps Integration

Pre-Created Roles

  • Instance owner: LOGIN, CREATEDB, CREATEROLE, BYPASSRLS
  • databricks_superuser: NOLOGIN; inherits pg_read_all_data + pg_write_all_data + pg_monitor

System roles (databricks_control_plane, databricks_monitor, databricks_writer_*, databricks_reader_*, databricks_gateway) are auto-created. Do not modify.

App SP → PG Role

Adding Lakebase as app resource auto-creates a PG role = SP client ID.

  • PGHOST: auto-injected
  • PGUSER: SP client ID = PG role
  • PGDATABASE: auto-injected

Role gets CONNECT + CREATE. Additional grants (SELECT, INSERT) must be added manually per-table.

Token refresh: @databricks/lakebase (auto) or SDK generate_database_credential() (manual).
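Connecting with the token as the PG password can be sketched from the injected env vars (DSN assembly only; obtain the token via generate_database_credential() or the @databricks/lakebase helper as noted above):

```python
import os


def lakebase_dsn(oauth_token: str) -> str:
    """libpq-style DSN from the auto-injected app env vars, using a
    Databricks OAuth token as the Postgres password (a sketch)."""
    host = os.environ["PGHOST"]
    user = os.environ["PGUSER"]      # SP client ID = PG role
    db = os.environ["PGDATABASE"]
    return (
        f"host={host} dbname={db} user={user} "
        f"password={oauth_token} sslmode=require"
    )
```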

Lakebase

Lakebase — UC Registration & Query Federation

Register in Unity Catalog

Creates a read-only UC catalog mirroring PG database structure. Enables browsing in Catalog Explorer, lineage tracking, and audit logs.

w.postgres.create_catalog(
  catalog=Catalog(spec=CatalogCatalogSpec(
    postgres_database="mydb",
    branch="projects/.../production",
  )),
  catalog_id="my-catalog",
).wait()

Prereqs: CREATE CATALOG on metastore + serverless SQL warehouse.

Cross-Source Queries

-- Join Lakebase + Delta
SELECT c.conversation_id,
       u.subscription_tier
FROM lakebase_catalog.public.conversations c
JOIN main.analytics.users u
  ON c.user_id = u.user_id;

Read-only: Cannot modify Lakebase data through UC queries. One database per catalog. Branch-bound.

Max 20 synced tables per source table. Metadata caching — new PG objects may not appear immediately.

Lakebase

Lakebase — Lakehouse Sync & Data API

Lakehouse Sync (Postgres → Delta)

Continuous CDC via wal2delta extension. SCD Type 2 history — every insert, update, delete preserved.

-- Step 1: Required before sync
ALTER TABLE my_table
  REPLICA IDENTITY FULL;

-- Destination: lb_<table>_history
-- Columns: _change_type,
-- _timestamp, _lsn, _xid

Gotchas: Partitioned tables unsupported. Schema changes break sync. Re-enable after disable = data loss (no re-snapshot). pgvector/PostGIS types unsupported.

Data API (PostgREST)

GET → SELECT
POST → INSERT
PATCH → UPDATE
DELETE → DELETE

Single “authenticator” PG role + RLS policies.

-- RLS per user
CREATE POLICY user_data
  ON tasks USING (
    user_id = current_setting(
      'request.jwt.claims'
    )::json->>'sub'
  );

Lakebase governs itself (PG) — UC governs everything else. But UC Registration + Lakehouse Sync bridge the two worlds.

Comparison

All Seven Resources

Resource | AuthZ Model | Per-User? | Key Gotcha
Serving Endpoints | UC + scopes | OBO: yes | ModelServingUserCredentials only in Serving context
Genie | UC + genie scopes | Yes (OBO) | Needs BOTH genie + dashboards.genie scopes
UC Functions | UC EXECUTE | Inherits context | Safe write pattern for agents
Vector Search | UC SELECT | Pre/post filter | No native per-user context
UC HTTP Connections | USE CONNECTION | Per User OAuth | Two-sided identity problem
Tables | Row filters / masks | current_user() | is_member() fails in Genie/Agent Bricks OBO
Lakebase | PG GRANT + RLS | PG RLS | NOT UC — only PG-native exception
Decision Guide

Which Authorization Path?

Does your resource use UC for authorization?
  • Yes (6 resources): standard pattern (SP grants, row filters, OAuth scopes)
  • No (Lakebase only): PG-native (roles, GRANT, RLS policies)

Does the user need per-user identity at the external service?
  • Yes: UC Connection with U2M Per User or Managed OAuth
  • No: Bearer or M2M OAuth is fine

Do agents need write access to tables?
  • Yes: wrap writes in UC Functions; grant EXECUTE, not table writes
  • No (read only): row filters + column masks on the table directly
Summary

Seven resources. Six use UC.
One uses PG-native.

1. Authorization is at the resource, not at the app.
2. Use current_user() everywhere. It works in every path.
3. Lakebase is the exception — PG GRANT + RLS, not UC.

github.com/bhavink/applied-ai-governance