AI Orchestration & Governance on Databricks
github.com/bhavink/applied-ai-governance

AI Orchestration
& Governance

Model Serving, MCP, AI Gateway, and Databricks Apps. One governance model underneath.
The Challenge

AI capabilities are easy to build. Governing them at scale is not.

You have agents that retrieve data, reason over it, and take action. You have natural language interfaces, document search, and tool integrations. Each of these needs to call models, access data, and invoke tools, all while preserving the caller's identity and enforcing access policies at every hop.

The orchestration layer is where identity meets infrastructure: serving endpoints, MCP servers, AI Gateway, and Databricks Apps each play a role. The question is how they compose, and how governance flows through the entire chain without gaps.

This deck maps the four pillars of AI orchestration and shows where governance is enforced at each layer.

Landscape

Four pillars of AI orchestration on Databricks

Each solves a different problem. All governed by Unity Catalog.

Model Serving

Deploy agents, foundation models, and custom models. Three auth methods: automatic passthrough, OBO, and manual.

MCP Servers

Standardized tool calling for agents. Managed (Genie, VS, Functions, SQL), External (via UC connections), or Custom (Databricks Apps).

AI Gateway

Rate limiting, guardrails, inference tables, traffic splitting, and fallback routing for serving endpoints.

Databricks Apps

Managed hosting with automatic OAuth, UC integration, and user identity propagation. Streamlit, Dash, Gradio, React.

Unity Catalog is the common governance layer: permissions, row filters, column masks, audit logs, and connections. Same enforcement regardless of orchestration path.

Agent Auth

Three authentication methods for agents on Model Serving

Automatic Passthrough

Simplest. Databricks manages everything.

How: Declare resource dependencies at log time
Identity: System-generated SP with least privilege
Tokens: Short-lived M2M OAuth, auto-rotated
Best for: No per-user access needed
Supports: Vector Search, Model Serving, UC Functions, Genie, SQL Warehouse, UC Table, UC Connection, Lakebase

On-Behalf-Of (OBO)

Agent runs as the calling user.

How: Initialize in predict() with user credentials
Identity: current_user() = human email
Tokens: Downscoped to declared API scopes
Best for: Per-user access control + audit
UC enforces row filters, column masks, ACLs per individual user at runtime

Manual Authentication

Explicit credentials. Maximum flexibility.

How: SP OAuth (recommended) or PAT via secrets
Identity: Service principal
Tokens: Manual rotation management
Best for: External resources, prompt registry, no passthrough
Never embed credentials in code. Always use UC secret scopes.

Mix and match: Use automatic for Vector Search, OBO for Genie, and manual for external APIs, all in the same agent.

Agent Auth

Which auth method for which resource?

Per-user access control needed?
OBO: Agent runs as calling user
Resource supports automatic passthrough?
AUTOMATIC: System SP, zero config
External resource outside Databricks?
MANUAL: SP OAuth via secrets
Need different creds than deployer?
MANUAL: Explicit credential injection
Prompt registry access?
MANUAL: Required for prompt registry
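The decision tree above can be sketched as a small helper. The function and its flags are illustrative, not a Databricks API; note that the manual-only cases (external resource, different credentials, prompt registry) win even when a resource supports passthrough:

```python
def choose_auth_method(per_user_access: bool,
                       supports_passthrough: bool,
                       external_resource: bool = False,
                       needs_different_creds: bool = False,
                       prompt_registry: bool = False) -> str:
    """Map the decision tree to an auth method (illustrative, not an API)."""
    if per_user_access:
        return "obo"        # agent runs as the calling user
    # Manual-only cases win even if the resource supports passthrough:
    if external_resource or needs_different_creds or prompt_registry:
        return "manual"     # explicit SP OAuth via secret scopes
    if supports_passthrough:
        return "automatic"  # system-generated SP, zero config
    return "manual"         # fallback: explicit credentials

# Vector Search with no per-user requirement:
print(choose_auth_method(per_user_access=False, supports_passthrough=True))  # automatic
```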

OBO Required Scopes

serving.serving-endpoints: Model Serving
vectorsearch.*: Vector Search
sql.warehouses: SQL Warehouses
dashboards.genie: Genie spaces
catalog.connections: UC connections

Key constraint: OBO user identity is only known at query time. Resources must be initialized in predict(), not __init__().
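The constraint can be illustrated with a minimal pyfunc-style sketch. Here `client_factory` is a stand-in for whatever creates a workspace client from the invoking user's credentials; the point is that it runs inside predict(), never in __init__():

```python
class OboAgent:
    """Sketch: OBO resources must be created per request, not at load time."""

    def __init__(self, client_factory):
        # No clients here: at load time there is no calling user,
        # so on-behalf-of credentials cannot exist yet.
        self._client_factory = client_factory

    def predict(self, context, model_input):
        # Per request: the factory resolves the invoking user's downscoped
        # token, so current_user() downstream is the human caller.
        client = self._client_factory()
        return client.query(model_input)

class FakeClient:
    """Stand-in for a workspace client created with user credentials."""
    def query(self, question):
        return f"answered as calling user: {question}"

agent = OboAgent(client_factory=FakeClient)
print(agent.predict(None, "show my cases"))
```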

Databricks Apps

Managed hosting with automatic OAuth and UC integration

User: Browser (workspace IdP) → Platform: Apps Proxy (auto OAuth) → Your Code: App Server (Streamlit, Gradio...) → Platform: UC Governance (fires as user)

What the platform handles

  • Authenticates user via workspace IdP (Entra, Cloud Identity, SSO)
  • Creates scoped OAuth token limited to configured permissions
  • Forwards request with user identity in HTTP headers
  • Your app reads x-forwarded-access-token with no login flows required

What you get

  • Direct access to SQL Warehouses, Model Serving, UC, Jobs
  • Per-user governance: row filters and column masks fire automatically
  • Serverless: no infrastructure to manage
  • Multi-framework: Streamlit, Dash, Gradio, React, Angular
  • CI/CD ready: databricks apps deploy from Git

vs. External hosting: No token exchange, no VPN config, no separate infrastructure. Trade-off: less flexibility for simpler security.
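Inside the app, the forwarded identity can be read with plain dict handling. The `x-forwarded-access-token` header comes from the slide above; `x-forwarded-email` (used later for the two-proxy caveat) is assumed to be injected alongside it:

```python
def forwarded_identity(headers: dict) -> dict:
    """Extract the identity the Apps proxy injected (case-insensitive)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    return {
        "token": lowered.get("x-forwarded-access-token"),  # scoped OAuth token
        "email": lowered.get("x-forwarded-email"),         # assumed companion header
    }

hdrs = {"X-Forwarded-Access-Token": "eyJ...",
        "X-Forwarded-Email": "reviewer@company.com"}
print(forwarded_identity(hdrs)["email"])  # reviewer@company.com
```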

MCP Integration

Three types of MCP servers, all governed by UC

Model Context Protocol standardizes tool calling for agents. Databricks supports three flavors.

Managed MCP

Databricks-hosted, ready to use

Genie: NL analytics
Vector Search: Retrieval
UC Functions: Deterministic tools
DBSQL: SQL execution
Auth: Automatic, OBO, or PAT

External MCP

Third-party servers via UC HTTP connections

Pattern: uc://connections/{name}
Proxy: UC handles credential injection
Access: USE CONNECTION grant
Agent never sees external credentials; UC handles the proxy securely

Custom MCP

Your servers on Databricks Apps

Hosting: Databricks Apps
Auth: OAuth only (PAT not supported)
Use case: Org-specific tools, custom logic
Two-proxy caveat: App A calling App B strips the user token. Use X-Forwarded-Email for identity.

MCP Auth

MCP authentication and required scopes

Auth methods apply to MCP

  • Automatic Passthrough: declare MCP servers at logging time, system SP handles auth
  • OBO: agent calls MCP with user identity via DatabricksMCPClient
  • Manual: required for custom MCP (OAuth only)

Declare MCP URLs and scopes when logging your agent so they can be verified at deploy time.

Required OAuth scopes by MCP type

mcp.genie: Genie spaces
mcp.functions: UC functions
mcp.vectorsearch: Vector Search
mcp.sql + sql.*: DBSQL
mcp.external: External MCP via UC connections

Minimize scopes: only request what your agent actually needs.
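Scope minimization can be sketched as deriving the request set from the mapping above. The dict mirrors the slide's scope strings; the helper itself is illustrative:

```python
# Scope strings as listed on the slide (illustrative mapping, not an API).
MCP_SCOPES = {
    "genie": ["mcp.genie"],
    "functions": ["mcp.functions"],
    "vectorsearch": ["mcp.vectorsearch"],
    "dbsql": ["mcp.sql", "sql.*"],
    "external": ["mcp.external"],
}

def minimal_scopes(server_types):
    """Request only the scopes the agent's declared MCP servers need."""
    scopes = set()
    for server_type in server_types:
        scopes.update(MCP_SCOPES[server_type])
    return sorted(scopes)

print(minimal_scopes(["genie", "dbsql"]))  # ['mcp.genie', 'mcp.sql', 'sql.*']
```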

External clients: Claude, Cursor, ChatGPT, and MCP Inspector can connect to Databricks MCP servers via OAuth or PAT (managed/external only).

External App Auth

Token Federation for apps outside Databricks

Bring your own IdP. No Databricks secrets required.

Your App: External App (web app, CI/CD, SaaS) → Your IdP: IdP JWT (Okta, Entra, GitHub) → Exchange: Token Endpoint (/oidc/v1/token) → Access: Databricks APIs (scoped OAuth token)

Account-wide Federation

For users. Configure at account level. Maps IdP users to Databricks users via SCIM sync. Corporate IdPs like Okta, Entra.

Workload Identity Federation

For automation. Configure on service principal. Maps workload identity (GitHub Actions, GitLab CI, Azure DevOps) to SP. Zero secrets in code.

Azure special case: Entra tokens work directly with Azure Databricks, with no exchange needed. Use MSAL with scope 2ff814a6-.../.default.

Token Exchange

RFC 8693 token exchange in four steps

The exchange flow

1. App authenticates with your IdP, receives a JWT
2. POST the JWT to /oidc/v1/token with grant_type=urn:ietf:params:oauth:grant-type:token-exchange
3. Databricks validates the JWT against the federation policy, returns a scoped OAuth token
4. Use the Databricks token to call APIs: Authorization: Bearer <token>
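Step 2 can be sketched with the standard library. The grant type and endpoint path come from the slide; the parameter names (`subject_token`, `subject_token_type`) follow RFC 8693, and the `scope` default is an assumption:

```python
from urllib.parse import urlencode

def build_exchange_request(workspace_url: str, idp_jwt: str,
                           scope: str = "all-apis"):
    """Build the RFC 8693 token-exchange POST for /oidc/v1/token."""
    url = f"{workspace_url}/oidc/v1/token"
    body = urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": idp_jwt,  # the JWT your IdP issued
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "scope": scope,
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return url, headers, body

url, headers, body = build_exchange_request(
    "https://example.cloud.databricks.com", "eyJ...")
# POST body to url, then call APIs with: Authorization: Bearer <returned token>
```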

Supported IdPs

Corporate: Okta, Entra, Ping, Auth0

CI/CD: GitHub Actions, Azure DevOps, GitLab CI, CircleCI

Cloud: AWS IAM, GCP Workload Identity, Kubernetes

Any OIDC-compliant provider that issues JWTs with iss, aud, and sub claims.

Security: Handle tokens server-side only. Never expose IdP or Databricks tokens to the browser. Tokens are short-lived; re-exchange when needed.

AI Gateway

Mosaic AI Gateway: centralized governance for serving endpoints

Configured directly on Model Serving endpoints. Governs LLM traffic at the point of consumption.

Rate Limiting

QPM or TPM at four levels: endpoint-wide, per-user default, custom user/SP overrides, and user groups. Max 20 rate limits per endpoint.

AI Guardrails

Safety filtering via Llama Guard 2 (violence, hate speech). PII detection for credit cards, SSN, emails, phone numbers. Options: Block, Mask, or None.

Inference Tables

Auto-log all requests/responses to UC Delta tables. Columns: request, response, status_code, execution_duration_ms, requester.

Traffic Splitting + Fallbacks

Route percentages to different models for A/B testing. Fallbacks auto-redirect on 429/5XX errors. Max 2 fallback models. Set 0% traffic for fallback-only.

Usage Tracking

system.serving.endpoint_usage for token counts and costs. usage_context parameter for per-project or per-user chargeback attribution.
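The five features above can be expressed as one gateway configuration payload. This is a hedged sketch of the request body for the AI Gateway REST API on a serving endpoint; the field names and structure are assumptions, so check them against the current API reference before use:

```python
import json

# Assumed shape of the AI Gateway config on a serving endpoint.
gateway_config = {
    "rate_limits": [
        {"calls": 100, "key": "endpoint", "renewal_period": "minute"},  # endpoint-wide QPM
        {"calls": 10, "key": "user", "renewal_period": "minute"},       # per-user default
    ],
    "guardrails": {
        "input": {"safety": True, "pii": {"behavior": "BLOCK"}},  # block unsafe/PII prompts
        "output": {"pii": {"behavior": "MASK"}},                  # mask PII in responses
    },
    "inference_table_config": {  # auto-log requests/responses to a UC Delta table
        "enabled": True, "catalog_name": "main", "schema_name": "ai_logs",
    },
    "usage_tracking_config": {"enabled": True},  # feeds system.serving.endpoint_usage
    "fallback_config": {"enabled": True},        # auto-redirect on 429/5XX
}

print(json.dumps(gateway_config, indent=2))
```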

AI Gateway

Feature support by endpoint type

Features compared across endpoint types (External Models, Foundation PT, Foundation PPT, Agents, Custom): Rate Limiting, Payload Logging, Usage Tracking, AI Guardrails, Fallbacks, Traffic Split.

External models get the most complete support: OpenAI, Anthropic, Cohere, Bedrock, Vertex AI, Azure OpenAI, and any OpenAI-compatible endpoint.

Paid features: Payload logging, usage tracking. Free: Permissions, rate limiting, fallbacks, traffic splitting.

Gateway Patterns

Four traffic patterns, four governance layers

1. Agent to Databricks services: No gateway. Direct OBO. UC governs at the data plane. A gateway adds latency or breaks identity.
2. Any caller to LLM endpoint: Databricks AI Gateway. Rate limits, guardrails, usage tracking, fallback routing.
3. Agent to external services: UC Connections + SNP. Proxy model with credential authorization. REVOKE = instant removal.
4. External clients to Databricks: External API Gateway. Auth translation, per-tenant rate limiting, API versioning, developer portal.

Patterns 2 + 4 are additive: External API gateway handles the boundary. Databricks AI Gateway governs LLM consumption. UC governs the data. Each layer has its own job.

Pattern 1 Deep Dive

Internal traffic: why a gateway breaks governance

With gateway (broken)

User → App → Gateway → Genie → UC

Gateway has two bad options:

  • Pass through: Adds latency, zero governance value
  • Re-issue token: current_user() = gateway-svc. Row filters fire as the gateway, not the user. Silent data leakage.

UC enforcement happens at the data plane. A gateway can observe HTTP traffic but cannot see which rows were filtered.

Direct OBO (correct)

User → App → Genie → UC

User token arrives at Genie unchanged.

  • current_user() = reviewer@company.com
  • Row filter returns only that reviewer's cases
  • Column masks apply per user's role

Databricks Apps already has a platform-managed proxy that handles OAuth validation and identity injection. An external gateway creates a redundant, conflicting auth layer.

UC Connections

UC Connections: the outbound governance primitive

Application code never calls external services directly. UC proxies the call, checks authorization, injects credentials.

Two-layer defense in depth

Network (SNP): Workspace-level FQDN allowlist. Defines the approved destination universe. Anything not listed is unreachable at the network layer.
Credential (UC): Per-app authorization. USE CONNECTION grant controls which SP can authenticate to which service. Enforced before any network traffic.

Access control

-- Grant access
GRANT USE CONNECTION
  ON CONNECTION github_api
  TO `sp-appeals`;

-- Instant revocation
REVOKE USE CONNECTION
  ON CONNECTION github_api
  FROM `sp-billing`;

No redeploy, no code change. Credential stored encrypted in UC; app code never receives the raw value.

Governance assumption: Credentials must live exclusively in UC Connections (not env vars or secrets). Enforce via CI/CD secret scanning + system.access.audit.

Least Privilege

Scope-based access model

Resource | OAuth Scope | UC Grant | Enforcement
SQL Warehouse | sql | CAN USE | Token + UC
Genie Space | genie + dashboards.genie | Space access + tables | Token + UC
Model Serving | serving | CAN QUERY | Token + UC
Vector Search | sql | SELECT on index | Token + UC
UC Connection | sql | USE CONNECTION | Token + UC
UC Function | sql | EXECUTE | Token + UC
MCP Server | mcp.* | Per MCP type | Token + UC

Key insight: Scopes limit what the token can do. Grants limit what the identity can access. Both enforce independently. Revoking either blocks access.

Configure only what your app needs. The list of available scopes grows as Databricks adds capabilities. Principle of least privilege at every layer.
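The dual enforcement can be sketched as two independent checks; both must pass, and revoking either side blocks access (names illustrative):

```python
def access_allowed(token_scopes: set, uc_grants: set,
                   required_scope: str, required_grant: str) -> bool:
    """Token scopes AND UC grants enforce independently."""
    has_scope = required_scope in token_scopes  # what the token may do
    has_grant = required_grant in uc_grants     # what the identity may touch
    return has_scope and has_grant

# Token carries the scope, but the UC grant was revoked -> blocked.
print(access_allowed({"sql"}, set(), "sql", "USE CONNECTION"))  # False
```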

UC Connection Auth

Four ways to authenticate to external services

All four enforce USE CONNECTION on the Databricks side. The difference: what identity does the external service see?

Method | External Identity | Credential Lifecycle | Best For
Bearer Token | Shared (one static token) | Manual rotation | Simple APIs with static keys
OAuth M2M | Shared (service/app credentials) | Auto-refresh | Service-to-service, no user context
OAuth U2M Shared | Shared (one user's OAuth token) | Auto-refresh | OAuth services without M2M support
OAuth U2M Per User | Per user (individual OAuth token) | Auto-refresh per user | User-scoped data (Drive, Gmail, repos)

U2M Per User is the only method with true end-to-end per-user identity. Each user completes OAuth consent once. Databricks stores their individual refresh token.

Supported for U2M Per User: Google (Drive, Docs, Gmail, Calendar), GitHub, Glean, SharePoint, and custom OAuth services with standard authorization code flow.

U2M Per User

U2M Per User: true end-to-end identity

Each user authenticates separately. User-scoped data access at the external service.

Flow

1. User triggers action in Databricks App/Agent

2. App calls UC Connection proxy

3. USE CONNECTION check passes

4. First time? User redirected to OAuth consent

Redirect URI: <workspace>/login/oauth/http.html

5. User grants access at external provider

6. Databricks stores per-user refresh token

7. Subsequent calls: auto-refresh, no consent needed

Why it matters

Databricks side: USE CONNECTION checks current_user()
External side: Individual user's OAuth token; user sees only their data
User leaves: Only their access breaks, not everyone's
Audit trail: Per-user at both Databricks and the external service

Gotchas

UC connection setup: common gotchas

redirect_uri_mismatch

The redirect URI in the OAuth provider must exactly match what Databricks sends: <workspace-url>/login/oauth/http.html. No trailing slash, no extra spaces. Check the error details page for the exact URI.

admin_policy_enforced

Organization admin blocks unauthorized third-party OAuth apps. The OAuth Client ID must be allowlisted in the admin console (e.g., Google Admin > Security > API Controls). No scope will work until this is resolved.

offline_access scope

Required to obtain a refresh token. Without it, the connection works for ~1 hour (access token lifetime) then fails silently. Always include this scope for U2M connections.

Immutable name

Connection name cannot be changed after creation. It becomes part of the MCP proxy URL. Choose a stable, descriptive name.

Owner access

Creator has irrevocable USE CONNECTION. Transfer ownership via ALTER CONNECTION ... SET OWNER TO ... if the creator should lose access.

MCP flag locked

isMcpConnection cannot be toggled after creation. Delete and recreate if you need to change it.

Decision Guide

Choosing the right pattern

App on Databricks, users have workspace accounts
OBO via Databricks Apps. Auto OAuth, UC fires as user.
App outside Databricks, users have own IdP
FEDERATION Token exchange (RFC 8693) to role-based SP.
Agent needs to call Genie / FM API / UC services
NO GATEWAY Direct OBO. UC governs at data plane.
Need rate limits / guardrails on LLM endpoint
AI GATEWAY Configure on serving endpoint.
Agent calls external APIs (GitHub, Slack, etc.)
UC CONNECTIONS Proxy + SNP. REVOKE = instant removal.
Agent needs user-scoped external data (personal Drive, Gmail)
U2M PER USER Each user authenticates separately.
External clients need managed API facade
EXT GATEWAY Auth translation + rate limiting + versioning.
CI/CD pipeline needs Databricks access
WORKLOAD ID Federation. Zero secrets in repo.
Summary

What We Covered

3 agent auth methods: Automatic Passthrough, OBO, Manual
3 MCP types: Managed, External, and Custom, all UC-governed
4 gateway patterns: No Gateway, AI Gateway, UC Connections, External
4 UC connection auth methods: Bearer, M2M, U2M Shared, U2M Per User
2 federation types: Account-wide (users) and Workload Identity (automation)
1 governance model: UC scopes, grants, row filters, column masks, connections

Unity Catalog is the common denominator.

Regardless of orchestration path (Apps, MCP, Gateway, or Federation), UC enforces governance at the data plane.

github.com/bhavink/applied-ai-governance