The same token — U2M, OBO, or M2M — hits different auth enforcement depending on the resource. Seven resources, seven auth stories.
| Resource | Recommended Path | Identity in UC | How Token Arrives | AuthZ Model |
|---|---|---|---|---|
| Serving Endpoints | OBO or M2M | User or SP | Authorization: Bearer | UC + scopes |
| Genie (via App) | OBO | Calling user | X-Forwarded-Access-Token → Genie API | UC + genie scopes |
| Genie (direct) | U2M | Calling user | User's own token | UC + genie scopes |
| UC Functions | OBO or M2M | User or SP | Via SQL execution context | UC EXECUTE |
| Vector Search | M2M | App SP | WorkspaceClient() no-args | UC SELECT |
| UC HTTP Connections | M2M + per-user OAuth | SP + ext user | USE CONNECTION + auth method | UC + external IdP |
| Tables | Any | Depends on path | Via SQL | Row filters, column masks |
| Lakebase | M2M | App SP (PG role) | OAuth token as PG password | PG-native (NOT UC) |
Agent Bricks and Model Serving
- OBO credentials: `ModelServingUserCredentials()` with `WorkspaceClient()`
- Token arrives via the `Authorization: Bearer` header
- `current_user()` = the token's identity
- Scope: `model-serving`
- The Agent Bricks supervisor auto-forwards the OBO token to sub-agents (max 20 in a chain).
Same OBO token propagates through entire chain. current_user() = human at every hop.
App → endpoint works fine: read X-Forwarded-Access-Token from the proxy header and pass it as Authorization: Bearer to the endpoint.
current_user() inside the endpoint = the human.
Wrong API: ModelServingUserCredentials() only works inside Model Serving. Don't use it in Apps code — use the header instead.
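A minimal sketch of that hand-off, assuming a request-handler context where the proxy headers are available (the endpoint URL in the comment is a placeholder):

```python
from typing import Mapping

def build_obo_headers(request_headers: Mapping[str, str]) -> dict:
    """Turn the proxy's forwarded user token into a Bearer header for the endpoint."""
    user_token = request_headers.get("X-Forwarded-Access-Token")
    if not user_token:
        raise ValueError("No forwarded user token; is the app running behind the proxy?")
    return {"Authorization": f"Bearer {user_token}"}

# Inside a request handler: POST with these headers to
# https://<workspace>/serving-endpoints/<name>/invocations
headers = build_obo_headers({"X-Forwarded-Access-Token": "eyJ-example"})
```

Because the endpoint sees the user's own token, `current_user()` inside it resolves to the human, not the app SP.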
App-to-app hops lose the user token: App B's proxy strips App A's forwarded user token and replaces it with App A's SP identity.
Workaround: X-Forwarded-Email survives proxy hops (set by proxy from validated token, cannot be forged). Use email + M2M SQL.
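A sketch of that workaround; `execute_sql` stands in for whatever M2M statement-execution helper the app uses, and the table name is hypothetical:

```python
def rows_for_user(request_headers, execute_sql):
    """Scope an M2M query to the human identified by the proxy-validated email header."""
    email = request_headers.get("X-Forwarded-Email")
    if not email:
        raise PermissionError("No validated user email from the proxy")
    # Parameterized so the header value never lands in the SQL text itself
    return execute_sql(
        "SELECT * FROM prod.sales.deals WHERE owner_email = :email",
        {"email": email},
    )
```

The query runs under the app SP's M2M credentials; the email only narrows the rows, it does not change the UC identity.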
Natural language to SQL, governed by UC
- Scopes: both `dashboards.genie` AND `genie` required
- `current_user()` = the calling human
- `is_member()` = the execution service (NOT the human)
- Rule: use `current_user()` for row filters, never `is_member()`.
Rate limit: 5 queries per minute (sliding window).
Use `conversation_id` for follow-up questions.
The UI only shows `dashboards.genie` when configuring app integration scopes, but the API checks for both `genie` and `dashboards.genie`.
Missing genie scope → 403: "required scopes: genie"
```
PATCH /api/2.0/preview/accounts/{account_id}/oidc/custom-app-integration/{integration_id}

{
  "scopes": [
    "dashboards.genie",
    "genie",
    "sql"
  ]
}
```
Add genie scope via the account-level API since the UI won't show it.
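As a sketch, the PATCH body can be built like this; the path constant mirrors the endpoint above, and auth/transport are left out as placeholders:

```python
import json

# Account-level endpoint from the slide; {account_id} and {integration_id} to be filled in
ACCOUNT_API = "/api/2.0/preview/accounts/{account_id}/oidc/custom-app-integration/{integration_id}"

def scopes_patch_body(existing_scopes: list) -> bytes:
    """Union the UI-granted scopes with the ones the Genie API actually checks."""
    scopes = sorted(set(existing_scopes) | {"dashboards.genie", "genie", "sql"})
    return json.dumps({"scopes": scopes}).encode()

# UI only granted dashboards.genie; the body adds genie and sql
body = scopes_patch_body(["dashboards.genie"])
```

Union-ing rather than overwriting keeps any scopes the integration already had.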
The safe way to expose write operations to agents
- AuthZ: `EXECUTE` on the function
- `current_user()` = whoever triggered the SQL
- Pattern: wrap INSERT/UPDATE in a UC function; grant `EXECUTE` to the agent SP.
Agent gets EXECUTE, not raw table writes.
Function enforces business logic + authorization internally. Agent never touches table directly.
```sql
CREATE FUNCTION prod.tools.approve_deal(
  deal_id STRING,
  approver STRING
)
RETURNS STRING
LANGUAGE SQL
RETURN (
  UPDATE prod.sales.deals
  SET status = 'approved',
      approved_by = approver
  WHERE id = deal_id
    AND current_user() IN (
      SELECT email FROM prod.sales.approvers
    )
);
```
| What | Detail |
|---|---|
| Agent SP | Has EXECUTE on function |
| Agent SP | Does NOT have UPDATE on table |
| Function | Enforces business logic internally |
| current_user() | Checked inside function body |
Principle: Functions are the authorization boundary for write operations. Grant EXECUTE, not table-level permissions.
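To make the boundary concrete, here is a sketch of how an agent tool might call the function; `execute_sql` is a stand-in for statement execution under the agent SP's M2M credentials, and the SQL shape is illustrative:

```python
def approve_deal_tool(execute_sql, deal_id: str, approver: str):
    """The agent's only write path: EXECUTE on the governed UC function.
    No UPDATE grant on prod.sales.deals is needed; the function body does the write."""
    return execute_sql(
        "SELECT prod.tools.approve_deal(:deal_id, :approver)",
        {"deal_id": deal_id, "approver": approver},
    )
```

If the agent SP's EXECUTE grant is revoked, the tool fails at the catalog, with no table-level permission to fall back on.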
Semantic similarity over governed data
- AuthZ: UC `SELECT` on the index
- `WorkspaceClient()` with no args picks up auto-injected SP credentials
- Scope: `vector-search`
- No native per-user context: Vector Search doesn't have OBO support for per-user filtering.
| Strategy | How it works |
|---|---|
| Pre-filter | Include user_group column in indexed data. Filter in similarity search query. |
| Post-filter | Retrieve top-K, then check user's permissions on each result. |
| Shared KB | M2M is correct — same docs for all users. No filtering needed. |
```python
# Index includes department column
results = vs_client.query(
    index_name="prod.docs.knowledge_idx",
    query_text=user_query,
    filters={"department": user_dept},
    num_results=10,
)
```
Filter happens inside the vector search engine. Only matching rows are scored.
Delta Sync index auto-updates when source table changes. No manual re-indexing needed.
Retrieve top-K results, then check each result against user's permissions. Higher latency, but works when permissions are complex or external.
Trade-off: Pre-filter is faster but requires group info in the index. Post-filter is flexible but wastes retrieval budget on filtered-out results.
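A post-filter sketch, with `user_can_read` standing in for whatever external permission check applies (all names hypothetical):

```python
def post_filter(results, user, k, user_can_read):
    """Keep only results the user may see, then trim back to the requested k."""
    allowed = [r for r in results if user_can_read(user, r["doc_id"])]
    return allowed[:k]

# Over-fetch (e.g. 3-5x the desired k) so filtering still leaves enough results
hits = post_filter(
    results=[{"doc_id": "a"}, {"doc_id": "b"}, {"doc_id": "c"}],
    user="alice@example.com",
    k=2,
    user_can_read=lambda u, d: d != "b",  # pretend "b" is restricted
)
```

The wasted retrieval budget is the over-fetched results that the permission check discards.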
The governed way to call external services
Governed connectivity to external APIs: Jira, GitHub, Salesforce, Slack, and more.
- Grant: `GRANT USE CONNECTION ON CONNECTION <name> TO <principal>`
- Without UC connections, agents end up with ungoverned, hard-to-rotate external credentials.
- UC connections give you governed, auditable, rotatable external access — with GRANT/REVOKE at runtime.
Who does Databricks see vs. who does the external service see?
| Auth Method | Databricks Sees | External Service Sees | Use Case |
|---|---|---|---|
| Bearer Token | Caller | Shared (one token) | Simple integrations, shared API keys |
| OAuth M2M | Caller | Shared (app creds) | Org-level access |
| OAuth U2M Shared | Caller | Shared (one user's) | Admin-delegated access |
| OAuth U2M Per User | Caller | Per-user ✓ | User-specific (Jira as them, GitHub as them) |
| Managed OAuth | Caller | Per-user ✓ | Google, SharePoint — Databricks handles OAuth |
Only U2M Per User and Managed OAuth give true per-user identity on BOTH sides. All others share a single external identity.
GRANT/REVOKE is runtime — removing USE CONNECTION immediately removes tool access. There is no way to scope a connection to read-only at the connection level.
The external service must enforce read vs. write permissions. UC controls who can use the connection, not what they can do through it.
Mitigation: Create separate connections with different external credentials (read-only vs. read-write) and GRANT them to different groups.
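One way to sketch that split in an app: route each agent tool to the narrowest connection it needs. Connection and tool names here are hypothetical; UC GRANTs on each connection decide who can actually use it:

```python
# Map tools to connections backed by different external credentials
TOOL_CONNECTION = {
    "search_issues": "jira_readonly",    # read-only API token on the external side
    "create_issue": "jira_readwrite",    # write-capable credential, granted to fewer groups
}

def connection_for(tool: str) -> str:
    """Resolve the connection a tool is allowed to use, failing closed."""
    try:
        return TOOL_CONNECTION[tool]
    except KeyError:
        raise PermissionError(f"No connection mapped for tool {tool!r}")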
Data-level authorization, enforced at the SQL engine
Row filters and column masks call current_user() or is_member() to determine access. ABAC adds governed tags for attribute-based rules: tag columns with sensitivity levels, then write filters that check user attributes against tags.
The #1 gotcha in table-level authorization
| Access Path | current_user() | is_member() |
|---|---|---|
| Notebook (U2M) | Human ✓ | Human's groups ✓ |
| App SQL (OBO direct) | Human ✓ | Human's groups ✓ |
| Genie (OBO) | Human ✓ | Execution service ✗ |
| Agent Bricks (OBO) | Human ✓ | Execution service ✗ |
| Background job (M2M) | SP UUID | SP's groups ✓ |
Rule: Always use current_user()
It returns the correct identity in every access path. is_member() breaks in Genie and Agent Bricks OBO because the execution context is a service, not the user.
The ONLY resource that does NOT use UC for authorization
Direct PG connection (apps, psql): PG roles + GRANT + RLS govern access.
Via lakehouse (notebooks, DBSQL, federation): UC policies apply, not PG grants.
Implication: Same data, different auth model depending on how you access it.
| Role | Privileges |
|---|---|
| Instance owner | LOGIN, CREATEDB, CREATEROLE, BYPASSRLS |
| databricks_superuser | NOLOGIN; inherits pg_read_all_data + pg_write_all_data + pg_monitor |
System roles (databricks_control_plane, databricks_monitor, databricks_writer_*, databricks_reader_*, databricks_gateway) are auto-created. Do not modify.
Adding Lakebase as app resource auto-creates a PG role = SP client ID.
| Env var | Value |
|---|---|
| PGHOST | Auto-injected |
| PGUSER | SP client ID = PG role |
| PGDATABASE | Auto-injected |
Role gets CONNECT + CREATE. Additional grants (SELECT, INSERT) must be added manually per-table.
Token refresh: @databricks/lakebase (auto) or SDK generate_database_credential() (manual).
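A refresh-before-connect sketch for the manual path; `fetch_token` stands in for a wrapper around the SDK's generate_database_credential() call, and the TTL is an assumption:

```python
import time

class LakebaseToken:
    """Cache a short-lived OAuth token used as the PG password, refreshing early."""
    def __init__(self, fetch_token, ttl_seconds=3600, refresh_skew=300):
        self.fetch_token = fetch_token   # e.g. wraps generate_database_credential()
        self.ttl = ttl_seconds           # assumed token lifetime
        self.skew = refresh_skew         # refresh this many seconds before expiry
        self._token, self._expires_at = None, 0.0

    def password(self) -> str:
        """Return a token valid long enough to open a new PG connection."""
        if time.time() >= self._expires_at - self.skew:
            self._token = self.fetch_token()
            self._expires_at = time.time() + self.ttl
        return self._token
```

Call `password()` each time you open a connection, rather than caching the string yourself, so expiry is handled in one place.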
Creates a read-only UC catalog mirroring PG database structure. Enables browsing in Catalog Explorer, lineage tracking, and audit logs.
```python
w.postgres.create_catalog(
    catalog=Catalog(spec=CatalogCatalogSpec(
        postgres_database="mydb",
        branch="projects/.../production",
    )),
    catalog_id="my-catalog",
).wait()
```
Prereqs: CREATE CATALOG on metastore + serverless SQL warehouse.
```sql
-- Join Lakebase + Delta
SELECT c.conversation_id,
       u.subscription_tier
FROM lakebase_catalog.public.conversations c
JOIN main.analytics.users u
  ON c.user_id = u.user_id;
```
Read-only: Cannot modify Lakebase data through UC queries. One database per catalog. Branch-bound.
Max 20 synced tables per source table. Metadata caching — new PG objects may not appear immediately.
Continuous CDC via wal2delta extension. SCD Type 2 history — every insert, update, delete preserved.
```sql
-- Step 1: Required before sync
ALTER TABLE my_table REPLICA IDENTITY FULL;

-- Destination: lb_<table>_history
-- Columns: _change_type, _timestamp, _lsn, _xid
```
Gotchas: Partitioned tables unsupported. Schema changes break sync. Re-enable after disable = data loss (no re-snapshot). pgvector/PostGIS types unsupported.
| HTTP verb | SQL operation |
|---|---|
| GET | SELECT |
| POST | INSERT |
| PATCH | UPDATE |
| DELETE | DELETE |
Single “authenticator” PG role + RLS policies.
```sql
-- RLS per user
CREATE POLICY user_data
ON tasks USING (
  user_id = current_setting(
    'request.jwt.claims'
  )::json->>'sub'
);
```
Lakebase governs itself (PG) — UC governs everything else. But UC Registration + Lakehouse Sync bridge the two worlds.
| Resource | AuthZ Model | Per-User? | Key Gotcha |
|---|---|---|---|
| Serving Endpoints | UC + scopes | OBO: yes | ModelServingUserCredentials only in Serving context |
| Genie | UC + genie scopes | Yes (OBO) | Needs BOTH genie + dashboards.genie scopes |
| UC Functions | UC EXECUTE | Inherits context | Safe write pattern for agents |
| Vector Search | UC SELECT | Pre/post filter | No native per-user context |
| UC HTTP Connections | USE CONNECTION | Per User OAuth | Two-sided identity problem |
| Tables | Row filters / masks | current_user() | is_member() fails in Genie/Agent Bricks OBO |
| Lakebase | PG GRANT + RLS | PG RLS | NOT UC — only PG-native exception |
| 1 | Authorization is at the resource, not at the app. |
| 2 | Use current_user() everywhere. It works in every path. |
| 3 | Lakebase is the exception — PG GRANT + RLS, not UC. |