Looking for row count monitoring? Use Row Count Monitoring for tracking row counts with ML-based anomaly detection or explicit thresholds.
Prerequisites: Before creating metrics, you need:
- A connected data source with discovery completed
- At least one asset (table/view) to monitor
Example: the customer_email column normally has ~3% null values. On Jan 30, the null percentage jumped to 12.3%, well outside the expected range band. AnomalyArmor flags this as an anomaly, indicating a potential data quality issue in the source system.
Why Use Metrics
Freshness tells you when data was updated. Completeness tells you how much arrived. Metrics tell you what changed at the column level:
| Issue | Freshness | Completeness | Metrics |
|---|---|---|---|
| ETL job failed completely | Detects it | Detects it | Detects it |
| ETL ran but loaded 0 rows | Might miss it | Catches it | N/A |
| Data loaded but 50% nulls | Misses it | Misses it | Catches it |
| Unexpected duplicates | Misses it | Misses it | Catches it |
| Values outside valid range | Misses it | Misses it | Catches it |
Metric Types
All metrics require a specific column to monitor:
| Type | Description | Best For |
|---|---|---|
| null_percent | Percentage of null values | Detecting missing data |
| distinct_count | Count of unique values | Cardinality monitoring |
| duplicate_count | Count of repeated values | Data quality checks |
| min_value | Minimum numeric value | Range validation |
| max_value | Maximum numeric value | Outlier detection |
| avg_value | Average numeric value | Central tendency |
| percentile | Nth percentile value | Distribution analysis |
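To make the metric types concrete, here is a minimal Python sketch of how each value could be computed from one column's data. It is an illustration, not AnomalyArmor's implementation: the function name `column_metrics` is hypothetical, `None` stands in for SQL NULL, `duplicate_count` is assumed to mean rows beyond the first occurrence of each value, and the percentile uses the nearest-rank method.

```python
import math
from collections import Counter
from statistics import mean


def column_metrics(values, percentile=95):
    """Compute illustrative versions of the metric types for one column.

    `values` is a list of column values; None represents SQL NULL.
    """
    total = len(values)
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)

    metrics = {
        # Share of rows whose value is null, as a percentage.
        "null_percent": 100.0 * (total - len(non_null)) / total if total else 0.0,
        # Number of different non-null values.
        "distinct_count": len(counts),
        # Assumption: count every row beyond the first occurrence of a value.
        "duplicate_count": sum(c - 1 for c in counts.values()),
    }

    numeric = sorted(v for v in non_null if isinstance(v, (int, float)))
    if numeric:
        # Nearest-rank percentile: smallest value with at least
        # `percentile` percent of the values at or below it.
        rank = max(1, math.ceil(percentile / 100 * len(numeric)))
        metrics.update(
            min_value=numeric[0],
            max_value=numeric[-1],
            avg_value=mean(numeric),
            percentile=numeric[rank - 1],
        )
    return metrics


sample = [10, 20, 20, None, 30, 30, 30, None, 40, 50]
print(column_metrics(sample))
```

In practice these values would be computed by a query against the monitored table; the sketch only shows what each number represents.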
Creating a Metric
Select Metric Type
Choose the type of metric you want to track:
- null_percent: Percentage of null values in a column
- distinct_count: Number of unique values
- duplicate_count: Number of duplicate values
- min_value / max_value / avg_value: Numeric range and central tendency
- percentile: Distribution analysis
Configure Capture Interval
Choose how often to capture the metric:
| Interval | Best For |
|---|---|
| Hourly | High-frequency data, real-time tables |
| Daily | Most batch ETL pipelines |
| Weekly | Slowly changing data |
Enable Anomaly Detection
Toggle Anomaly Detection on and set sensitivity:
| Sensitivity | Meaning | Use When |
|---|---|---|
| 1.0 | Alert at 1 standard deviation | Very sensitive |
| 2.0 | Alert at 2 standard deviations | Balanced (recommended) |
| 3.0 | Alert at 3 standard deviations | Less sensitive |
Viewing Metric History
Each metric tracks historical values and displays them as a trend chart:
- Value line: Actual metric values over time
- Anomaly band: Expected range (mean +/- sensitivity * stddev)
- Anomaly points: Values outside the band are flagged
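The band formula (mean +/- sensitivity * stddev) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, the sample standard deviation is used, and the history values are made-up null percentages around 3%. The product's actual baseline calculation may differ.

```python
from statistics import mean, stdev


def anomaly_band(history, sensitivity=2.0):
    """Return (lower, upper) as mean +/- sensitivity * stddev of past values."""
    center = mean(history)
    spread = stdev(history)  # sample standard deviation; needs at least 2 points
    return center - sensitivity * spread, center + sensitivity * spread


def is_anomaly(value, history, sensitivity=2.0):
    """True when the value falls outside the band built from the history."""
    lower, upper = anomaly_band(history, sensitivity)
    return value < lower or value > upper


# Hypothetical daily null_percent captures for one column.
null_percent_history = [2.8, 3.1, 3.0, 2.9, 3.2, 3.0, 2.7, 3.3]

print(anomaly_band(null_percent_history))      # expected range at sensitivity 2.0
print(is_anomaly(12.3, null_percent_history))  # a jump to 12.3% falls outside it
```

With this history the mean is 3.0 and the standard deviation is 0.2, so at sensitivity 2.0 the band runs from roughly 2.6 to 3.4.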
Reading the Chart
| Indicator | Meaning |
|---|---|
| Green line within band | Normal values |
| Red dot outside band | Anomaly detected |
| Gray dashed lines | Upper/lower bounds |
Which Metric Type Should I Use?
Is my table growing or shrinking unexpectedly?
Use Row Count Monitoring. It provides ML-based pattern learning, time-windowed counting, and explicit threshold support for row count monitoring.
Are there unexpected null values?
Use null_percent on the column that shouldn’t have nulls.
Example: Monitor customer_email for null percentage. Alert if nulls exceed historical baseline (e.g., jumps from 2% to 15%).
Are values within expected range?
Use min_value and max_value on numeric columns.
Example: Monitor price column. Alert if minimum drops below 0 (invalid) or maximum exceeds historical norms.
Is data being duplicated?
Use duplicate_count on columns that should be unique.
Example: Monitor order_id for duplicates. Any duplicates indicate a data quality issue.
How many unique values exist?
Use distinct_count on categorical columns.
Example: Monitor country_code distinct count. A sudden increase might indicate invalid data.
Best Practices
Start with High-Impact Metrics
Focus on metrics that catch real problems:
Critical table (orders):
- Completeness: Catch data loss or duplication (see Row Count Monitoring)
- null_percent on order_id: Should never be null
- null_percent on customer_id: Should never be null
- min_value on total_amount: Should never be negative
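The three column-level expectations for the orders table can be expressed as simple checks. The following Python sketch is illustrative only: `check_orders` and the sample rows are hypothetical, rows are modeled as dictionaries with `None` for NULL, and nothing here reflects how AnomalyArmor stores or evaluates metrics.

```python
def check_orders(rows):
    """Return human-readable violations of the starter expectations."""
    violations = []
    if not rows:
        return violations

    # order_id and customer_id should never be null.
    for column in ("order_id", "customer_id"):
        nulls = sum(1 for row in rows if row.get(column) is None)
        null_percent = 100.0 * nulls / len(rows)
        if null_percent > 0:
            violations.append(
                f"{column}: null_percent is {null_percent:.1f}%, expected 0%"
            )

    # total_amount should never be negative.
    amounts = [r["total_amount"] for r in rows if r.get("total_amount") is not None]
    if amounts and min(amounts) < 0:
        violations.append(
            f"total_amount: min_value is {min(amounts)}, expected >= 0"
        )
    return violations


# Made-up sample rows with one null customer_id and one negative amount.
orders = [
    {"order_id": 1, "customer_id": "c1", "total_amount": 19.99},
    {"order_id": 2, "customer_id": None, "total_amount": -5.0},
    {"order_id": 3, "customer_id": "c3", "total_amount": 42.0},
    {"order_id": 4, "customer_id": "c4", "total_amount": 7.5},
]
for problem in check_orders(orders):
    print(problem)
```

Checks like these encode fixed expectations; anomaly detection on the same metrics additionally catches shifts that stay within the valid range.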
Match Capture Interval to Data Freshness
| Data Update Pattern | Recommended Interval |
|---|---|
| Real-time streaming | Hourly |
| Hourly batch jobs | Hourly |
| Daily batch jobs | Daily |
| Weekly aggregates | Weekly |
Use Meaningful Sensitivity Values
| Scenario | Sensitivity | Rationale |
|---|---|---|
| New table, learning patterns | 3.0 | Reduce noise while learning |
| Established table, stable patterns | 2.0 | Balanced detection |
| Critical data, low tolerance | 1.5 | More sensitive alerting |
Troubleshooting
Metric shows 'No data'
Causes:
- Metric was just created and hasn’t captured yet
- Capture job failed
- Table is empty
Solutions:
- Wait for the next scheduled capture (check interval)
- Trigger a manual capture: Actions > Capture Now
- Check the table has data
Too many false positive anomalies
Causes:
- Sensitivity value is too low (detection is too sensitive)
- Normal data patterns are highly variable
- Seasonality not accounted for
Solutions:
- Raise the sensitivity value (e.g., from 2.0 to 3.0) to widen the expected band
- Allow more baseline data to accumulate (30+ days)
- Consider whether the variation is actually expected
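To see how the sensitivity value trades false positives against missed anomalies, the sketch below flags the same candidate values at three settings. It assumes the band is mean +/- sensitivity * stddev with the sample standard deviation. The function name and the noisy distinct_count history are made up for illustration.

```python
from statistics import mean, stdev


def flagged_values(history, candidates, sensitivity):
    """Return the candidates that fall outside mean +/- sensitivity * stddev."""
    center, spread = mean(history), stdev(history)
    lower = center - sensitivity * spread
    upper = center + sensitivity * spread
    return [v for v in candidates if v < lower or v > upper]


# Hypothetical, highly variable distinct_count history and new captures.
history = [100, 120, 90, 110, 80, 130, 95, 115]
new_captures = [125, 140, 160]

for sensitivity in (1.0, 2.0, 3.0):
    print(sensitivity, flagged_values(history, new_captures, sensitivity))
```

On this data a sensitivity of 1.0 flags all three captures, 2.0 flags two, and 3.0 flags only the largest, which is why raising the value reduces noise on variable metrics.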
Missing real anomalies
Causes:
- Sensitivity value is too high (detection is not sensitive enough)
- Baseline includes anomalous data
- Capture interval too infrequent
Solutions:
- Lower the sensitivity value (e.g., from 3.0 to 2.0) to narrow the expected band
- Reset baseline after fixing data issues
- Increase capture frequency
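A baseline that includes anomalous data hides later anomalies because the outlier inflates both the mean and the standard deviation. The following sketch illustrates the effect under the same assumed band formula (mean +/- sensitivity * stddev, sample standard deviation). The function name and values are hypothetical, and it does not show how a baseline reset is performed in the product.

```python
from statistics import mean, stdev


def exceeds_band(value, baseline, sensitivity=2.0):
    """True when value lies more than sensitivity * stddev from the mean."""
    center, spread = mean(baseline), stdev(baseline)
    return abs(value - center) > sensitivity * spread


# Hypothetical null_percent captures; the last value of the second list
# is a past incident that was never removed from the baseline.
clean_baseline = [2.8, 3.1, 3.0, 2.9, 3.2, 3.0, 2.7, 3.3]
contaminated_baseline = clean_baseline + [12.3]

new_value = 8.0
print(exceeds_band(new_value, clean_baseline))         # flagged against clean data
print(exceeds_band(new_value, contaminated_baseline))  # masked by the old incident
```

A single 12.3% capture raises the standard deviation from 0.2 to about 3.1, so a later value of 8.0% no longer leaves the band at sensitivity 2.0.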
Metric capture failing
Causes:
- Database connection issues
- Column was renamed or removed
- Permission changes
Solutions:
- Check data source connection status
- Verify column still exists
- Check database user permissions
What’s Next
Set Up Metric Alerts
Get notified when metrics detect anomalies
Metrics API
Automate metric management with the API
Report Badges
Embed metric status in dashboards
Alert Rules
Configure where alerts are sent
