Connect AnomalyArmor to your Databricks workspace to monitor Unity Catalog assets. We support Delta tables, views, and schema-level metadata; see What We Monitor for the full list.

Requirements

Before connecting, ensure you have:
  • Databricks workspace with Unity Catalog enabled
  • SQL Warehouse (serverless or classic)
  • Personal Access Token or Service Principal credentials
  • Catalog access for the catalogs you want to monitor

Connection Settings

Field | Description | Example
Connection Name | Friendly identifier | Databricks Production
Workspace URL | Your Databricks workspace | https://xxx.cloud.databricks.com
HTTP Path | SQL warehouse path | /sql/1.0/warehouses/abc123
Catalog | Unity Catalog to monitor | main
Access Token | Authentication token | dapi...
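As a rough sketch of how these fields fit together, the Python below turns them into connection arguments and runs a trivial query. It assumes the open-source `databricks-sql-connector` package; the document does not say which client AnomalyArmor uses internally, and the function names `build_connect_kwargs` and `check_connection` are hypothetical.

```python
from urllib.parse import urlparse


def build_connect_kwargs(workspace_url: str, http_path: str, access_token: str) -> dict:
    """Turn the form fields into keyword arguments for a SQL client."""
    # Accept the URL with or without a scheme; only the hostname is needed.
    url = workspace_url if "://" in workspace_url else f"https://{workspace_url}"
    hostname = urlparse(url).hostname
    if not hostname:
        raise ValueError(f"not a valid workspace URL: {workspace_url!r}")
    if not http_path.startswith("/sql/"):
        raise ValueError(f"HTTP path should start with /sql/: {http_path!r}")
    if not access_token:
        raise ValueError("access token is empty")
    return {
        "server_hostname": hostname,
        "http_path": http_path,
        "access_token": access_token,
    }


def check_connection(**kwargs) -> bool:
    """Run SELECT 1 to confirm the warehouse accepts the credentials."""
    # Assumption: `pip install databricks-sql-connector` has been run.
    from databricks import sql

    with sql.connect(**kwargs) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            return cursor.fetchone()[0] == 1
```

Running `check_connection(**build_connect_kwargs(url, path, token))` before saving the data source is one way to catch a mistyped URL or path early.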

Finding Your Connection Details

Workspace URL

Your workspace URL appears in the browser address bar when you are logged into Databricks. The format depends on your cloud:
Cloud | URL Format
Azure | https://adb-1234567890.12.azuredatabricks.net
AWS | https://dbc-abc123.cloud.databricks.com
GCP | https://xxx.gcp.databricks.com
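If you want to sanity-check a pasted URL, a small helper like the following can tell which cloud it belongs to. This is only a sketch: the hostname suffixes are taken from the table above, and `detect_cloud` is a hypothetical name, not part of AnomalyArmor.

```python
from typing import Optional
from urllib.parse import urlparse

# Hostname suffixes from the URL format table, most specific first.
_CLOUD_SUFFIXES = [
    (".azuredatabricks.net", "Azure"),
    (".gcp.databricks.com", "GCP"),
    (".cloud.databricks.com", "AWS"),
]


def detect_cloud(workspace_url: str) -> Optional[str]:
    """Return "Azure", "AWS", or "GCP", or None if the host matches no known format."""
    host = (urlparse(workspace_url).hostname or "").lower()
    for suffix, cloud in _CLOUD_SUFFIXES:
        if host.endswith(suffix):
            return cloud
    return None
```

A `None` result usually means the URL was copied without its `https://` prefix or is not a Databricks workspace URL at all.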

SQL Warehouse HTTP Path

  1. Go to SQL Warehouses in Databricks
  2. Click on your warehouse
  3. Go to Connection Details tab
  4. Copy the HTTP Path
HTTP Path format:
/sql/1.0/warehouses/abc123def456
                    ↑ Your warehouse ID
Use a Serverless SQL Warehouse for best compatibility. Classic warehouses work too but may have startup delays.
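To check a copied HTTP path before pasting it into the connection form, you could validate it against the format shown above. The sketch below assumes warehouse IDs consist of lowercase letters and digits, as in the example; that character set is an assumption, and `warehouse_id` is a hypothetical helper.

```python
import re

# Assumption: warehouse IDs are lowercase letters and digits, as in the example.
_HTTP_PATH = re.compile(r"^/sql/1\.0/warehouses/([0-9a-z]+)$")


def warehouse_id(http_path: str) -> str:
    """Extract the warehouse ID from an HTTP path, or raise ValueError."""
    match = _HTTP_PATH.match(http_path.strip())
    if match is None:
        raise ValueError(f"unexpected HTTP path format: {http_path!r}")
    return match.group(1)
```

For instance, `warehouse_id("/sql/1.0/warehouses/abc123def456")` returns `"abc123def456"`, while a path copied without the leading `/sql/1.0/` segment raises `ValueError`.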

Creating an Access Token

A personal access token is best for quick setup and testing:
  1. Click your username in Databricks → User Settings
  2. Go to Access Tokens tab
  3. Click Generate New Token
  4. Set a description: AnomalyArmor
  5. Set lifetime (or leave blank for no expiry)
  6. Click Generate
  7. Copy the token immediately (you won’t see it again)
Token format: dapi1234567890abcdef1234567890abcdef
Personal access tokens are tied to your user account. If you leave the organization, the token stops working. Consider using a service principal for production.

Granting Catalog Permissions

The user or service principal needs read access to Unity Catalog.
Quick Setup: Download the Databricks permissions script for a ready-to-use SQL template with Unity Catalog grants.
-- Minimal permissions for AnomalyArmor

-- Access to catalog
GRANT USE CATALOG ON CATALOG production TO `anomalyarmor`;

-- Access to all schemas in catalog
GRANT USE SCHEMA ON CATALOG production TO `anomalyarmor`;

-- Read access to tables
GRANT SELECT ON CATALOG production TO `anomalyarmor`;

Per-Schema Permissions

For more granular control:
-- Access specific schemas only
GRANT USE SCHEMA ON SCHEMA production.raw TO `anomalyarmor`;
GRANT USE SCHEMA ON SCHEMA production.staging TO `anomalyarmor`;
GRANT USE SCHEMA ON SCHEMA production.marts TO `anomalyarmor`;

-- Read access per schema
GRANT SELECT ON SCHEMA production.raw TO `anomalyarmor`;
GRANT SELECT ON SCHEMA production.staging TO `anomalyarmor`;
GRANT SELECT ON SCHEMA production.marts TO `anomalyarmor`;
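When you manage many schemas, it can be easier to generate these statements than to write them by hand. The following is a minimal sketch that reproduces the grants shown above for a given principal; `grant_statements` and `quote` are hypothetical helpers, and the output should be reviewed before you run it in Databricks.

```python
from typing import Iterable, List, Optional


def quote(identifier: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def grant_statements(
    principal: str, catalog: str, schemas: Optional[Iterable[str]] = None
) -> List[str]:
    """Build read-only grants: catalog-wide if schemas is None, else per schema."""
    who = quote(principal)
    cat = quote(catalog)
    statements = [f"GRANT USE CATALOG ON CATALOG {cat} TO {who};"]
    if schemas is None:
        # Catalog-level grants cover every schema in the catalog.
        statements.append(f"GRANT USE SCHEMA ON CATALOG {cat} TO {who};")
        statements.append(f"GRANT SELECT ON CATALOG {cat} TO {who};")
    else:
        for schema in schemas:
            target = f"{cat}.{quote(schema)}"
            statements.append(f"GRANT USE SCHEMA ON SCHEMA {target} TO {who};")
            statements.append(f"GRANT SELECT ON SCHEMA {target} TO {who};")
    return statements


if __name__ == "__main__":
    for line in grant_statements("anomalyarmor", "production", ["raw", "staging", "marts"]):
        print(line)
```

The per-schema form still includes `USE CATALOG`, since a principal needs it to reach any schema inside the catalog.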

What We Monitor

AnomalyArmor discovers and monitors these Unity Catalog objects:
Object Type | Monitored | Notes
Delta Tables | Yes | Including managed and external
Views | Yes | Standard and materialized
Schemas | Yes | Schema-level metadata
Volumes | No | Coming soon
Functions | No | Not supported

Metadata Captured

For each table and view:
  • Table name and schema
  • Column names and data types
  • Table properties
  • Last modified timestamp (for freshness)
  • Partitioning information

Multiple Catalogs

3-Level Namespace Support

Databricks Unity Catalog uses a 3-level namespace: catalog.schema.table. AnomalyArmor fully supports this structure, enabling you to:
  • Track tables across catalogs: Distinguish between prod.analytics.users and dev.analytics.users
  • Filter by catalog: View only tables from specific catalogs in the UI
  • Catalog-aware alerting: Get notified of changes in production catalogs only
  • Lineage across catalogs: Track data flow between development, staging, and production
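To illustrate the three-level namespace, here is a small sketch that splits a fully qualified name into its parts and filters a list of tables by catalog. It assumes identifiers contain no dots or backtick quoting; `TableRef`, `parse_table_ref`, and `filter_by_catalog` are hypothetical names, not part of any AnomalyArmor API.

```python
from typing import Iterable, List, NamedTuple


class TableRef(NamedTuple):
    catalog: str
    schema: str
    table: str


def parse_table_ref(full_name: str) -> TableRef:
    """Split "catalog.schema.table" into its three parts."""
    # Assumption: no part contains a dot or needs backtick quoting.
    parts = full_name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected catalog.schema.table, got {full_name!r}")
    return TableRef(*parts)


def filter_by_catalog(full_names: Iterable[str], catalog: str) -> List[str]:
    """Keep only the names whose catalog part matches."""
    return [name for name in full_names if parse_table_ref(name).catalog == catalog]
```

With this, `prod.analytics.users` and `dev.analytics.users` parse to the same schema and table but different catalogs, which is the distinction the features above rely on.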

Connecting Multiple Catalogs

To monitor multiple catalogs, create a separate data source for each:
  • Databricks Production (catalog: production)
  • Databricks Staging (catalog: staging)
  • Databricks Development (catalog: development)
Each data source needs access to its respective catalog. You can reuse the same token across data sources if it has permissions on every catalog.

Catalog-Aware Features

Feature | Catalog Support
Schema Discovery | Tables shown with full catalog.schema.table path
Schema Drift Alerts | Filter alerts by catalog
Tag Inheritance | Tags propagate within catalog boundaries
Table Filtering | API supports catalog_name filter parameter
Lineage Visualization | Shows cross-catalog data dependencies

SQL Warehouse Considerations

Warehouse State

AnomalyArmor queries run on your SQL warehouse. Consider:
Warehouse Type | Behavior
Serverless | Auto-starts, minimal delay
Classic (Auto-stop) | May have startup delay (30 s to 2 min)
Classic (Always-on) | Immediate, but costs more

Warehouse Sizing

Discovery queries are lightweight. A Small or X-Small warehouse is sufficient:
  • Recommended: Serverless SQL Warehouse
  • Alternative: X-Small Classic Warehouse with auto-stop

Scheduling Discovery

If using a classic warehouse with auto-stop:
  1. Schedule discovery during business hours, when the warehouse is likely already running
  2. Or extend auto-stop timeout to cover discovery windows
  3. Or use serverless (recommended)
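If you script your own checks against a classic warehouse, polling until it has started avoids treating the startup delay as a failure. The sketch below is a generic retry loop, not AnomalyArmor's behavior; `wait_until_ready` and its `probe` callback are hypothetical, and the 180-second default simply leaves some margin over the startup delay of up to 2 minutes mentioned above.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 180.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() every interval_s seconds until it returns True or time runs out."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Here `probe` could be any function that returns `True` once a trivial query such as `SELECT 1` succeeds; `sleep` and `clock` are injectable so the loop can be tested without waiting.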

Connection Architecture

[Figure: Databricks connection architecture]

What We Query

AnomalyArmor runs these types of queries:
-- List schemas
SHOW SCHEMAS IN CATALOG production;

-- List tables
SHOW TABLES IN SCHEMA production.raw;

-- Get table details
DESCRIBE TABLE EXTENDED production.raw.events;

-- Check freshness (for tables with timestamp columns)
SELECT MAX(event_timestamp) FROM production.raw.events;
Impact: Minimal. The listing and DESCRIBE queries read metadata only and don’t scan table data. The freshness check is an aggregate over a single timestamp column.
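The result of the freshness query is a single timestamp, which then has to be compared against an expected update interval. A minimal sketch of that comparison follows; it is an illustration, not AnomalyArmor's actual logic, and `is_stale` and `max_age` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(
    last_event: Optional[datetime],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the newest row is older than max_age.

    last_event is the MAX(timestamp) result and must be timezone-aware;
    None (an empty table) counts as stale.
    """
    if last_event is None:
        return True
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_event > max_age
```

For a table loaded by a daily job, a `max_age` slightly longer than 24 hours avoids false alarms when a run finishes a few minutes late.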

Troubleshooting

Connection or authentication fails

Common causes:
  1. Invalid or expired access token
  2. Wrong workspace URL
  3. Incorrect HTTP path
Solutions:
  1. Generate a new access token
  2. Verify workspace URL matches your browser
  3. Copy HTTP path directly from SQL Warehouse settings

Permission denied

Causes:
  • Token lacks catalog/schema permissions
  • Service principal not granted access
Solutions:
-- Check current permissions
SHOW GRANTS ON CATALOG production;

-- Grant necessary permissions
GRANT USE CATALOG ON CATALOG production TO `your-user`;
GRANT USE SCHEMA ON CATALOG production TO `your-user`;
GRANT SELECT ON CATALOG production TO `your-user`;

SQL warehouse not found

Causes:
  • Wrong HTTP path
  • Warehouse deleted or renamed
Solutions:
  1. Go to SQL Warehouses in Databricks
  2. Copy the HTTP path from Connection Details
  3. Ensure the warehouse exists and is accessible

Connection or discovery times out

Causes:
  • Warehouse is starting up
  • Large number of tables
Solutions:
  1. Use a serverless warehouse (faster startup)
  2. Extend warehouse auto-stop timeout
  3. Filter to specific schemas if catalog is very large

Token expired

Causes:
  • Personal access token has expiry date
Solutions:
  1. Generate a new token with longer expiry
  2. Use a service principal with OAuth (no expiry)
  3. Update the token in AnomalyArmor Data Sources settings

Best Practices

Use Service Principals for Production

Personal access tokens are tied to individual users. If that user leaves:
  • Token stops working
  • Monitoring breaks
Service principals are organization-owned and persist regardless of user changes.

Monitor Production Catalog

Start with your production catalog where schema changes have the most impact.

Schedule Discovery After ETL

If you have predictable ETL schedules, run discovery after ETL completes to catch changes immediately:
ETL Schedule:     2:00 AM daily
Discovery Schedule: 3:00 AM daily (1 hour after ETL)
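If you express schedules as cron expressions, the offset can be computed rather than maintained by hand. This sketch assumes a standard five-field cron format; whether AnomalyArmor accepts cron syntax is not stated here, and `discovery_time` and `daily_cron` are hypothetical helpers.

```python
from typing import Tuple


def discovery_time(etl_hour: int, etl_minute: int, delay_minutes: int = 60) -> Tuple[int, int]:
    """Return (hour, minute) for discovery, delay_minutes after the ETL start time."""
    total = (etl_hour * 60 + etl_minute + delay_minutes) % (24 * 60)
    return divmod(total, 60)


def daily_cron(hour: int, minute: int) -> str:
    """Five-field cron expression for a job that runs once a day."""
    return f"{minute} {hour} * * *"


if __name__ == "__main__":
    hour, minute = discovery_time(2, 0)  # ETL at 2:00 AM
    print(daily_cron(hour, minute))      # prints "0 3 * * *"
```

The delay should cover your longest expected ETL run, not the average, so that discovery never starts while tables are still being written.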

Next Steps

  • Run Discovery: Scan your Databricks catalog
  • Set Up Alerts: Get notified of schema changes