Audience: Data Engineers, Analytics Engineers

Schema changes are one of the most common causes of pipeline failures. A dropped column upstream can cascade into failed dbt runs, broken dashboards, and late-night debugging sessions. This guide shows you how to use AnomalyArmor to catch schema changes before they impact your pipelines.

The Problem

[Diagram: schema change cascade leading to pipeline failure]

The Solution

With AnomalyArmor, you’ll know about schema changes before your pipelines run.

[Diagram: pipeline event detection timeline]

Setup Guide

Step 1: Connect Your Source Databases

Connect the databases that your pipelines read from, not just your warehouse. Common sources to monitor:
  • Production application databases (the ones your dbt reads from)
  • Third-party data sources
  • Shared data lakes
For each source, follow the connection guide.

Step 2: Schedule Frequent Discovery

For pipeline-critical databases, run discovery frequently:
| Database Type | Recommended Schedule | Why |
|---|---|---|
| Application databases | Hourly | Changes can happen anytime |
| Shared warehouses | Every 6 hours | Less frequent changes |
| Third-party sources | Daily | Usually stable |
Configure in: Data Sources → [Your Connection] → Settings → Discovery Schedule

Step 3: Create Breaking Change Alerts

Set up alerts specifically for changes that break pipelines:

Rule: Breaking Schema Changes (Production)

| Field | Value |
|---|---|
| Event | Schema Change Detected |
| Data Source | production-app-db |
| Schema | public |
| Assets | All (or list specific tables) |
| Change Type | Column Removed, Table Removed, Type Changed |
| Destinations | Slack #data-engineering, Email data-team@company.com |
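If you manage alert rules as code, the rule above could be expressed as a config object. The structure below is a hypothetical sketch, not AnomalyArmor's documented rule format:

```python
# Hypothetical as-code representation of the "Breaking Schema Changes
# (Production)" rule; AnomalyArmor's actual rule schema may differ.
breaking_changes_rule = {
    "name": "Breaking Schema Changes (Production)",
    "event": "schema_change_detected",
    "data_source": "production-app-db",
    "schema": "public",
    "assets": "all",  # or a list of specific tables
    "change_types": ["column_removed", "table_removed", "type_changed"],
    "destinations": [
        {"type": "slack", "channel": "#data-engineering"},
        {"type": "email", "address": "data-team@company.com"},
    ],
}

print(breaking_changes_rule["name"])  # Breaking Schema Changes (Production)
```

Keeping rules in version control makes it easy to review scope changes the same way you review dbt model changes.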

Step 4: Time Alerts Before Pipeline Runs

If your dbt runs at 3 AM, schedule discovery at 2 AM.

[Diagram: discovery schedule timeline strategy]
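The lead time between discovery and the pipeline run is simple datetime arithmetic; the one-hour buffer below is an illustrative choice, not an AnomalyArmor requirement:

```python
from datetime import datetime, timedelta

def discovery_time(pipeline_run: datetime,
                   buffer: timedelta = timedelta(hours=1)) -> datetime:
    """Schedule discovery far enough ahead of the pipeline run that
    alerts have time to fire and be acted on."""
    return pipeline_run - buffer

# dbt runs at 3 AM, so discovery should run at 2 AM
run_at = datetime(2024, 1, 15, 3, 0)
print(discovery_time(run_at).strftime("%H:%M"))  # 02:00
```

Pick a buffer long enough for discovery to complete on your largest source, plus time for someone to react to an alert.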

Advanced: Pre-dbt Validation

Option 1: Webhook Integration

Use webhooks to fail your pipeline early if breaking changes are detected:
  1. Set up a webhook destination in AnomalyArmor
  2. Point it at a validation endpoint in your orchestrator
  3. If the webhook fires, block the dbt run

The flow looks like this:
  1. AnomalyArmor alert fires on schema change
  2. Webhook sent to Airflow/Dagster
  3. Orchestrator sets a flag: schema_changes_detected = true
  4. dbt task checks the flag before running
  5. If the flag is true: fail fast with a meaningful error

Option 2: Discovery Schedule Alignment

Align discovery with your orchestration schedule:
```python
# In your Airflow DAG
from airflow.providers.http.operators.http import SimpleHttpOperator
from airflow.operators.bash import BashOperator

# Fetch the latest discovery results before dbt runs
discovery_check = SimpleHttpOperator(
    task_id='check_for_schema_changes',
    http_conn_id='anomalyarmor',
    endpoint='/api/v1/discoveries/latest',
    method='GET',
)

run_dbt = BashOperator(
    task_id='run_dbt',
    bash_command='dbt run',
)

# dbt only runs if the discovery check succeeds
discovery_check >> run_dbt
```
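To make the check actually gate the run, you can attach a `response_check` callable to the HTTP task (Airflow fails the task, and skips downstream tasks, when the callable returns False). The response body shape below is an assumption for illustration, not AnomalyArmor's documented API:

```python
import json

# Change types that should block the pipeline
BREAKING = {"column_removed", "table_removed", "type_changed"}

def no_breaking_changes(response) -> bool:
    """response_check callable for SimpleHttpOperator.

    Returns False (failing the task, so dbt never runs) if the latest
    discovery reported any breaking change. Assumes a hypothetical
    response body like {"changes": [{"change_type": "column_removed"}]}.
    """
    body = json.loads(response.text)
    return not any(
        change.get("change_type") in BREAKING
        for change in body.get("changes", [])
    )
```

Pass it as `response_check=no_breaking_changes` on the `SimpleHttpOperator` above; a failing check then blocks `run_dbt` with a clear task failure instead of a confusing mid-run dbt error.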

What to Do When Alerts Fire

Immediate Actions

  1. Acknowledge the alert: Let your team know you’re investigating
  2. Check the change details: View in AnomalyArmor: what changed, when, and on which asset
  3. Assess impact: Which models/dashboards use this table?

If the Change is Breaking

  1. Pause affected pipelines (if possible before they run)
  2. Update your dbt models to handle the change
  3. Test locally with the new schema
  4. Deploy the fix before the next scheduled run

If the Change is Expected

  1. Document it: Note in AnomalyArmor or your team wiki
  2. Update downstream: Ensure all dependents are updated
  3. Consider communication: Should you announce to stakeholders?

Model Dependency Mapping

Know which models depend on which tables. For example, for source table production.orders:
  • stg_orders (staging model)
    • int_orders_enriched (intermediate)
      • fct_orders (fact table)
        • monthly_revenue (dashboard)
        • customer_lifetime_value (analytics)
    • rpt_daily_orders (report)
  • dim_order_status (dimension)
When production.orders changes, all of these are potentially impacted.
Use dbt’s graph selector to see all downstream dependencies: dbt ls --select stg_orders+ (a trailing + selects the model plus everything downstream of it; a leading + selects its upstream parents).
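If you want that downstream list programmatically (for example, in a CI check), a small wrapper around the command works; this assumes dbt is installed and is run from the project root:

```python
import subprocess

def parse_dbt_ls(output: str) -> list[str]:
    """Turn raw `dbt ls` stdout into a clean list of node names."""
    return [line.strip() for line in output.splitlines() if line.strip()]

def downstream_models(model: str) -> list[str]:
    """List `model` plus every model downstream of it, using dbt's
    graph selector (trailing `+`)."""
    result = subprocess.run(
        ["dbt", "ls", "--select", f"{model}+", "--resource-type", "model"],
        capture_output=True, text=True, check=True,
    )
    return parse_dbt_ls(result.stdout)

# e.g. downstream_models("stg_orders") would list stg_orders,
# int_orders_enriched, fct_orders, and so on
```

Comparing this list against the tables named in a schema-change alert tells you immediately which models need attention.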

Alert Configuration Examples

| Priority | Rule Name | Event | Scope | Conditions | Destinations |
|---|---|---|---|---|---|
| High | Revenue Table Changes | Schema Change | orders, payments, transactions | Any change | Slack #data-critical, PagerDuty |
| Medium | Dimension Table Changes | Schema Change | dim_*, *_lookup | Column removed or type changed | Slack #data-engineering |
| Low | External Source Changes | Schema Change | external.*, partner_* | Any change | Email (daily digest) |

Troubleshooting

An alert didn’t fire?
  1. Check discovery timing: Did discovery run before the pipeline?
  2. Check scope: Is the table included in the alert rule?
  3. Check conditions: Does the change type match your conditions?
  4. Verify destination: Is the destination configured correctly?

Too many alerts?
  1. Filter change types: Alert only on Column Removed, Table Removed, Type Changed
  2. Exclude test schemas: Filter out test_*, dev_*
  3. Separate environments: Different rules for prod vs. staging

Trouble connecting to a source database?
  1. Use a read replica: Monitor the replica instead of the primary
  2. Create a dedicated user: With read-only permissions
  3. Check network access: Firewall rules, security groups
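The noise-reduction filters above (breaking change types only, test/dev schemas excluded) amount to a simple predicate. A minimal sketch, using hypothetical lowercase change-type identifiers:

```python
from fnmatch import fnmatch

# Change types worth waking someone up for
BREAKING_TYPES = {"column_removed", "table_removed", "type_changed"}
# Schemas that should never alert
EXCLUDED_SCHEMAS = ["test_*", "dev_*"]

def is_alertworthy(change_type: str, schema: str) -> bool:
    """Alert only on breaking change types, never for test/dev schemas."""
    if any(fnmatch(schema, pattern) for pattern in EXCLUDED_SCHEMAS):
        return False
    return change_type in BREAKING_TYPES

print(is_alertworthy("column_removed", "public"))    # True
print(is_alertworthy("column_added", "public"))      # False
print(is_alertworthy("column_removed", "test_tmp"))  # False
```

In AnomalyArmor itself this logic lives in the alert rule’s change-type conditions and scope filters; the sketch just makes the decision table explicit.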

Checklist

Before going live:
  • Connected all source databases that feed pipelines
  • Discovery scheduled to run before pipeline runs
  • Alert rules for breaking changes (column/table removed)
  • Alerts routed to the right channel (data engineering team)
  • Team knows what to do when alerts fire
  • Documented critical table dependencies

Related Guides

  • Schema Monitoring: Deep dive into schema change detection
  • Alert Rules: Configure alert conditions