Row Count Monitoring tracks row counts in your tables over time. It detects when data volumes drop unexpectedly (data loss) or spike unusually (duplicate loads), helping you catch ETL issues before they impact downstream consumers.
Why Row Count? Row count monitoring used to be part of Data Quality Metrics. We moved it to its own feature with enhanced capabilities: ML-based pattern learning, time-windowed counting, and explicit threshold support.
Example scenario: The orders table typically receives 45,000-55,000 rows daily. On Jan 30, only 15,234 rows were loaded, a 70% drop indicating a potential ETL failure.

Two Monitoring Modes

Row Count Monitoring offers two approaches to fit different needs.

Auto-Learn Mode

Let AnomalyArmor learn your table’s normal row count patterns:
| Aspect | How It Works |
| --- | --- |
| Learning period | Collects data for 7+ days to establish a baseline |
| Pattern detection | Identifies daily, weekly, and seasonal trends |
| Anomaly detection | Uses statistical analysis (mean +/- stddev * sensitivity) |
| Best for | Tables with consistent, predictable patterns |
Auto-learn example (orders table):

Day 1-7:    Learning... collecting baseline data
Day 8+:     Baseline established (avg: 48,000, stddev: 3,200)
            Alerts if row count deviates significantly
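
The statistical check described above (mean +/- stddev * sensitivity) can be sketched in a few lines of Python. This is a minimal illustration of the technique, not AnomalyArmor's actual implementation; the function names and the sample baseline values are made up for the example:

```python
from statistics import mean, stdev

def anomaly_bounds(baseline_counts, sensitivity=2):
    """Acceptable row-count range: mean +/- stddev * sensitivity.

    A lower sensitivity narrows the band (more alerts, more sensitive);
    a higher sensitivity widens it (fewer alerts).
    """
    mu = mean(baseline_counts)
    sigma = stdev(baseline_counts)
    return mu - sigma * sensitivity, mu + sigma * sensitivity

def is_anomaly(count, baseline_counts, sensitivity=2):
    """True if the observed count falls outside the learned band."""
    lo, hi = anomaly_bounds(baseline_counts, sensitivity)
    return not (lo <= count <= hi)

# Hypothetical 7-day baseline resembling the example (avg ~48,000)
baseline = [47000, 49500, 48200, 46800, 50100, 48900, 47500]
print(is_anomaly(15234, baseline))  # a 70% drop falls outside the band
print(is_anomaly(48000, baseline))  # a typical day stays inside it
```

With sensitivity 2 and this baseline, the band works out to roughly 45,700-50,800 rows, so the 15,234-row day from the scenario above would be flagged.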

Explicit Mode

Set specific row count thresholds when you know exactly what to expect:
| Setting | Description |
| --- | --- |
| Min rows | Alert if row count falls below this value |
| Max rows | Alert if row count exceeds this value |
| Best for | Tables with known, fixed expectations |
Explicit example (daily_summary table):

Expected: Exactly 1 row per day
Min: 1, Max: 1
Alert if row count != 1
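
The explicit min/max check is simple enough to express directly. The sketch below is an illustration of the logic, not AnomalyArmor's code; the function name and return shape are assumptions:

```python
def check_explicit(count, min_rows, max_rows):
    """Evaluate a row count against explicit thresholds.

    Returns a (status, message) pair: "anomaly" if the count falls
    outside [min_rows, max_rows], otherwise "healthy".
    """
    if count < min_rows:
        return "anomaly", f"row count {count} below minimum {min_rows}"
    if count > max_rows:
        return "anomaly", f"row count {count} above maximum {max_rows}"
    return "healthy", "row count within expected range"

# daily_summary: expect exactly one row per day (min=1, max=1)
print(check_explicit(1, min_rows=1, max_rows=1))  # healthy
print(check_explicit(0, min_rows=1, max_rows=1))  # anomaly: missing row
```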

Time-Windowed Counting

For tables that accumulate data over time, use a timestamp column to count rows within a specific window:
| Window | Use Case |
| --- | --- |
| 1 hour | Real-time event streams |
| 6 hours | Frequent batch loads |
| 12 hours | Twice-daily pipelines |
| 24 hours | Daily batch ETL (most common) |
| 168 hours | Weekly aggregates |
Time-windowed counting (orders table with created_at):

Without time window:  COUNT(*) = 5,000,000 (all time)
With 24h window:      COUNT(*) WHERE created_at >= now() - 24h = 48,000
Use time-windowed counting for append-only tables. Without it, row counts only grow, making anomaly detection less useful.
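
A time-windowed count boils down to adding a timestamp predicate to the count query. The helper below sketches how such a query could be built; it is an assumption for illustration (generic SQL syntax, made-up function name), not the query AnomalyArmor itself issues:

```python
from datetime import datetime, timedelta, timezone

def windowed_count_sql(table, ts_column, window_hours):
    """Build a COUNT(*) query restricted to the last `window_hours`.

    Identifiers are interpolated directly for readability; in real code
    they should be validated or quoted to avoid SQL injection. Exact
    timestamp syntax varies by warehouse.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(hours=window_hours)
    return (
        f"SELECT COUNT(*) FROM {table} "
        f"WHERE {ts_column} >= '{cutoff.isoformat()}'"
    )

# 24-hour window over the orders table, as in the example above
print(windowed_count_sql("orders", "created_at", 24))
```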

Setting Up Row Count Monitoring

1. Navigate to the Asset

Go to Assets and select the table you want to monitor.
2. Open Data Quality Tab

Click the Data Quality tab on the asset detail page, then scroll to the Row Count Monitoring section.
3. Create Schedule

Click Create Schedule and configure:
  • Table: Select the table to monitor
  • Timestamp column: (Optional) For time-windowed counting
  • Time window: How far back to count rows
  • Check interval: How often to check (1h, 6h, 12h, 24h)
4. Choose Monitoring Mode

Select your monitoring approach:

Auto-Learn Mode:
  • Toggle Auto-learn on
  • Set Sensitivity (1-4, lower = more sensitive)
  • Wait for learning period to complete
Explicit Mode:
  • Toggle Auto-learn off
  • Set Expected min rows
  • Set Expected max rows
5. Save and Monitor

Click Create. The first check runs immediately, then continues on your configured interval.

Understanding Results

Status Indicators

| Status | Meaning | Action |
| --- | --- | --- |
| Healthy | Row count within expected range | None needed |
| Anomaly | Row count outside expected range | Investigate the cause |
| Learning | Collecting baseline data | Wait for learning to complete |
| No Data | No checks have run yet | Check will run on next interval |

Anomaly Types

| Anomaly | Possible Causes |
| --- | --- |
| Row count too low | ETL failure, data loss, filter bug, source issue |
| Row count too high | Duplicate load, removed filter, upstream spike |
| Row count zero | Complete ETL failure, wrong table, permissions |

Best Practices

Choose the Right Mode

| Scenario | Recommended Mode |
| --- | --- |
| Data patterns vary naturally | Auto-learn with sensitivity 2-3 |
| Exact expectations known | Explicit with min/max thresholds |
| New table, unknown patterns | Auto-learn with sensitivity 3-4 |
| Critical data, low tolerance | Auto-learn with sensitivity 1-2 |

Set Appropriate Windows

| Data Pattern | Recommended Window |
| --- | --- |
| Real-time streaming | 1 hour |
| Hourly batch jobs | 6 hours |
| Daily batch jobs | 24 hours |
| Weekly aggregates | 168 hours |

Start Conservative, Then Tighten

  1. Week 1: Use auto-learn with sensitivity 3 (less sensitive)
  2. Week 2-4: Review any anomalies, adjust if too noisy
  3. Month 2+: Tighten to sensitivity 2 once patterns are stable

Row Count vs. Metrics

| Feature | Row Count | Data Quality Metrics |
| --- | --- | --- |
| Purpose | Monitor row counts | Monitor column statistics |
| Scope | Table-level | Column-level |
| ML-based | Yes (auto-learn) | Yes (anomaly detection) |
| Time windows | Yes | No |
| Explicit thresholds | Yes | Via checks |
Use Row Count Monitoring for: “Did the right amount of data arrive?”
Use Metrics for: “Is the data quality correct?” (nulls, duplicates, ranges)

Troubleshooting

Schedule stays in Learning status

Causes:
  • Not enough data points collected yet
  • Check interval is very long (weekly)
Solutions:
  1. Wait for at least 7 data points (7 days for daily checks)
  2. Consider switching to explicit mode if you know expected values
Too many anomaly alerts (false positives)

Causes:
  • Sensitivity is too low (too sensitive)
  • Natural data variation is high
  • Seasonality not yet learned
Solutions:
  1. Increase sensitivity (e.g., 2 to 3)
  2. Allow more baseline data (30+ days)
  3. Switch to explicit mode with wider thresholds
Real anomalies not flagged (false negatives)

Causes:
  • Sensitivity is too high (not sensitive enough)
  • Baseline includes anomalous data
Solutions:
  1. Decrease sensitivity (e.g., 3 to 2)
  2. Switch to explicit mode with tighter thresholds
Time-windowed count returns zero

Causes:
  • Timestamp column has no recent data
  • Wrong timestamp column selected
  • Time window too narrow
Solutions:
  1. Verify timestamp column has data in the window
  2. Check column data type (should be timestamp/datetime)
  3. Widen the time window

What’s Next

Set Up Alerts

Get notified when row count anomalies are detected

Data Quality Metrics

Monitor column-level statistics like null percentages

Freshness Monitoring

Track when data was last updated

Report Badges

Embed row count status in dashboards