Data quality metrics let you track statistical properties of your columns over time. AnomalyArmor captures metric values on a schedule, builds historical baselines, and automatically detects when values fall outside expected ranges.
Looking for row count monitoring? Use Row Count Monitoring for tracking row counts with ML-based anomaly detection or explicit thresholds.
Prerequisites: Before creating metrics, you need:
Example scenario: The customer_email column normally has ~3% null values. On Jan 30, the null percentage jumped to 12.3%, well outside the expected range. AnomalyArmor flags this as an anomaly, indicating a potential data quality issue in the source system.

Why Use Metrics

Freshness tells you when data was updated. Completeness tells you how much arrived. Metrics tell you what changed at the column level:
Issue | Freshness | Completeness | Metrics
ETL job failed completely | Detects it | Detects it | Detects it
ETL ran but loaded 0 rows | Might miss it | Catches it | N/A
Data loaded but 50% nulls | Misses it | Misses it | Catches it
Unexpected duplicates | Misses it | Misses it | Catches it
Values outside valid range | Misses it | Misses it | Catches it
Use freshness for “did data arrive on time?” Use row count monitoring for “did the right amount of data arrive?” Use metrics for “is the column-level data quality correct?”

Metric Types

All metrics require a specific column to monitor:
Type | Description | Best For
null_percent | Percentage of null values | Detecting missing data
distinct_count | Count of unique values | Cardinality monitoring
duplicate_count | Count of repeated values | Data quality checks
min_value | Minimum numeric value | Range validation
max_value | Maximum numeric value | Outlier detection
avg_value | Average numeric value | Central tendency
percentile | Nth percentile value | Distribution analysis
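
To make the metric types concrete, here is a minimal Python sketch that computes each one over an in-memory column. It is an illustration only: column_metrics is a hypothetical helper, None stands in for SQL NULL, and the exact definitions AnomalyArmor applies (for example, how duplicates are counted or how percentiles are interpolated) are not stated here, so the choices below are assumptions.

```python
import math
from collections import Counter
from typing import Optional, Sequence


def column_metrics(values: Sequence[Optional[float]], pct: float = 95.0) -> dict:
    """Compute illustrative column metrics; None stands in for SQL NULL."""
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    metrics = {
        # Share of rows where the column is null, as a percentage
        "null_percent": 100.0 * (len(values) - len(non_null)) / len(values) if values else 0.0,
        # Number of different non-null values
        "distinct_count": len(counts),
        # Assumption: every repeat beyond the first occurrence counts as a duplicate
        "duplicate_count": sum(c - 1 for c in counts.values()),
    }
    if non_null:
        ordered = sorted(non_null)
        # Assumption: nearest-rank percentile (no interpolation)
        rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
        metrics.update(
            min_value=ordered[0],
            max_value=ordered[-1],
            avg_value=sum(ordered) / len(ordered),
            percentile=ordered[rank - 1],
        )
    return metrics


if __name__ == "__main__":
    print(column_metrics([10.0, 20.0, 20.0, None, 30.0]))
```

Each capture of a metric corresponds to one such value recorded at a point in time; the history of those values is what the baseline is built from.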

Creating a Metric

1. Navigate to the Asset

Go to Assets and select the table you want to monitor.
2. Open Metrics Tab

Click the Metrics tab on the asset detail page.
3. Create New Metric

Click Create Metric to open the metric configuration form.
4. Select Metric Type

Choose the type of metric you want to track:
  • null_percent: Percentage of null values in a column
  • distinct_count: Number of unique values
  • duplicate_count: Number of duplicate values
  • min_value, max_value, avg_value: Numeric range and central tendency
  • percentile: Distribution analysis
Need to monitor row counts? Use Row Count Monitoring instead.
5. Configure Capture Interval

Choose how often to capture the metric:
Interval | Best For
Hourly | High-frequency data, real-time tables
Daily | Most batch ETL pipelines
Weekly | Slowly changing data
6. Enable Anomaly Detection

Toggle Anomaly Detection on and set sensitivity:
Sensitivity | Meaning | Use When
1.0 | Alert at 1 standard deviation | Very sensitive
2.0 | Alert at 2 standard deviations | Balanced (recommended)
3.0 | Alert at 3 standard deviations | Less sensitive
Start with sensitivity 2.0. Adjust based on false positive rate.
7. Save Metric

Click Create to save the metric. The first capture will run immediately.
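
To get a feel for the sensitivity values offered in the Enable Anomaly Detection step, the sketch below estimates how often a healthy metric would be flagged purely by chance. It assumes captured values are roughly normally distributed around the baseline, which is not stated in this guide and which real metrics often violate. expected_alert_rate is a hypothetical helper, not part of AnomalyArmor.

```python
import math


def expected_alert_rate(sensitivity: float) -> float:
    """Two-sided tail probability P(|Z| > sensitivity) for a standard normal.

    Assumption: metric values are normally distributed around the baseline.
    """
    return math.erfc(sensitivity / math.sqrt(2.0))


if __name__ == "__main__":
    for s in (1.0, 2.0, 3.0):
        print(f"sensitivity {s}: ~{expected_alert_rate(s):.1%} of captures flagged by chance")
```

Under that assumption, sensitivity 1.0 flags roughly 32% of normal captures, 2.0 roughly 4.6%, and 3.0 roughly 0.3%. With a daily capture interval, 2.0 would therefore produce about one chance alert every three weeks, which is one way to judge whether the false positive rate you observe is in line with expectations.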

Viewing Metric History

Each metric tracks historical values and displays them as a trend chart:
  • Value line: Actual metric values over time
  • Anomaly band: Expected range (mean +/- sensitivity * stddev)
  • Anomaly points: Values outside the band are flagged
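
The band formula above (mean +/- sensitivity * stddev) can be sketched in a few lines of Python. This is an illustration under assumptions: the helper names are hypothetical, the sample standard deviation is used, and the whole history is treated as the baseline. Whether AnomalyArmor uses a rolling window or a different estimator is not stated here. The sample values mirror the customer_email scenario from the introduction.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def anomaly_band(history: Sequence[float], sensitivity: float = 2.0) -> Tuple[float, float]:
    """Return (lower, upper) bounds as mean +/- sensitivity * stddev."""
    center = mean(history)
    spread = stdev(history)  # assumption: sample standard deviation
    return center - sensitivity * spread, center + sensitivity * spread


def is_anomaly(value: float, history: Sequence[float], sensitivity: float = 2.0) -> bool:
    """Flag a value that falls outside the band built from prior captures."""
    lower, upper = anomaly_band(history, sensitivity)
    return value < lower or value > upper


if __name__ == "__main__":
    # Hypothetical daily null_percent captures hovering around 3%
    null_percent_history = [2.8, 3.1, 3.0, 2.9, 3.2, 3.0, 3.1]
    print(anomaly_band(null_percent_history))
    print(is_anomaly(12.3, null_percent_history))
    print(is_anomaly(3.1, null_percent_history))
```

A stable history produces a narrow band, so a jump from about 3% to 12.3% lands far outside it, while an ordinary value such as 3.1% stays inside.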

Reading the Chart

Indicator | Meaning
Green line within band | Normal values
Red dot outside band | Anomaly detected
Gray dashed lines | Upper/lower bounds

Which Metric Type Should I Use?

For row counts, use Row Count Monitoring. It provides ML-based pattern learning, time-windowed counting, and explicit threshold support.
Use null_percent on the column that shouldn’t have nulls. Example: Monitor customer_email for null percentage. Alert if nulls exceed the historical baseline (e.g., a jump from 2% to 15%).
Use min_value and max_value on numeric columns. Example: Monitor the price column. Alert if the minimum drops below 0 (invalid) or the maximum exceeds historical norms.
Use duplicate_count on columns that should be unique. Example: Monitor order_id for duplicates. Any duplicates indicate a data quality issue.
Use distinct_count on categorical columns. Example: Monitor the country_code distinct count. A sudden increase might indicate invalid data.

Best Practices

Start with High-Impact Metrics

Focus on metrics that catch real problems. For a critical table such as orders:
  • Completeness: Catch data loss or duplication (see Row Count Monitoring)
  • null_percent on order_id: Should never be null
  • null_percent on customer_id: Should never be null
  • min_value on total_amount: Should never be negative

Match Capture Interval to Data Freshness

Data Update Pattern | Recommended Interval
Real-time streaming | Hourly
Hourly batch jobs | Hourly
Daily batch jobs | Daily
Weekly aggregates | Weekly

Use Meaningful Sensitivity Values

Scenario | Sensitivity | Rationale
New table, learning patterns | 3.0 | Reduce noise while learning
Established table, stable patterns | 2.0 | Balanced detection
Critical data, low tolerance | 1.5 | More sensitive alerting

Troubleshooting

Metric shows no values
Causes:
  • Metric was just created and hasn’t captured yet
  • Capture job failed
  • Table is empty
Solutions:
  1. Wait for the next scheduled capture (check interval)
  2. Trigger a manual capture: Actions > Capture Now
  3. Check the table has data
Too many anomalies flagged (false positives)
Causes:
  • Sensitivity value is too low, so detection is too sensitive
  • Normal data patterns are highly variable
  • Seasonality not accounted for
Solutions:
  1. Increase sensitivity (e.g., 2.0 to 3.0)
  2. Allow more baseline data to accumulate (30+ days)
  3. Consider if the variation is actually expected
Real anomalies are not flagged
Causes:
  • Sensitivity value is too high, so detection is not sensitive enough
  • Baseline includes anomalous data
  • Capture interval too infrequent
Solutions:
  1. Decrease sensitivity (e.g., 3.0 to 2.0)
  2. Reset baseline after fixing data issues
  3. Increase capture frequency
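
The "baseline includes anomalous data" cause is easier to see with numbers. The sketch below assumes the band is mean +/- sensitivity * stddev over the full history (using the sample standard deviation); the band helper and the sample values are hypothetical. It shows how one extreme capture left in the baseline can widen the band enough to hide a later problem.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def band(history: Sequence[float], sensitivity: float) -> Tuple[float, float]:
    """Return (lower, upper) as mean +/- sensitivity * sample stddev."""
    center, spread = mean(history), stdev(history)
    return center - sensitivity * spread, center + sensitivity * spread


def flagged(value: float, history: Sequence[float], sensitivity: float = 2.0) -> bool:
    lower, upper = band(history, sensitivity)
    return value < lower or value > upper


# Hypothetical null_percent captures: a clean baseline, and the same
# baseline after one bad load (12.3) was left in the history.
clean = [3.0, 2.9, 3.1, 3.0, 2.8, 3.2, 3.0]
contaminated = clean + [12.3]

if __name__ == "__main__":
    new_value = 9.0  # a later capture that is clearly abnormal
    print("clean baseline flags it:", flagged(new_value, clean))
    print("contaminated baseline flags it:", flagged(new_value, contaminated))
```

With the clean history the value 9.0 is flagged; once 12.3 is part of the baseline, the larger standard deviation stretches the band past 9.0 and the same value goes undetected. This is why resetting the baseline after fixing a data issue restores detection.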
Metric capture fails
Causes:
  • Database connection issues
  • Column was renamed or removed
  • Permission changes
Solutions:
  1. Check data source connection status
  2. Verify column still exists
  3. Check database user permissions

What’s Next

  • Set Up Metric Alerts: Get notified when metrics detect anomalies
  • Metrics API: Automate metric management with the API
  • Report Badges: Embed metric status in dashboards
  • Alert Rules: Configure where alerts are sent