Why Row Count?
Row count monitoring used to be part of Data Quality Metrics. We moved it to its own feature with enhanced capabilities: ML-based pattern learning, time-windowed counting, and explicit threshold support.
Two Monitoring Modes
Row Count Monitoring offers two approaches to fit different needs:
Auto-Learn Mode (Recommended)
Let AnomalyArmor learn your table’s normal row count patterns:

| Aspect | How It Works |
|---|---|
| Learning period | Collects data for 7+ days to establish baseline |
| Pattern detection | Identifies daily, weekly, and seasonal trends |
| Anomaly detection | Flags row counts outside mean ± (stddev × sensitivity) |
| Best for | Tables with consistent, predictable patterns |
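The anomaly rule in the table can be illustrated with a minimal sketch. It assumes the band is computed from a flat history of past counts using the sample standard deviation; the function names and the baseline values are hypothetical, and the sketch does not model the daily, weekly, or seasonal patterns that the product learns.

```python
from statistics import mean, stdev

def expected_range(history, sensitivity):
    """Return (low, high) as mean -/+ stddev * sensitivity.

    Assumption: sample standard deviation over a flat history.
    """
    if len(history) < 2:
        raise ValueError("need at least two baseline counts")
    center = mean(history)
    spread = stdev(history) * sensitivity
    return center - spread, center + spread

def is_anomaly(row_count, history, sensitivity=2.0):
    """True when row_count falls outside the expected range."""
    low, high = expected_range(history, sensitivity)
    return row_count < low or row_count > high

# Hypothetical baseline: seven daily row counts.
baseline = [1000, 1020, 980, 1010, 990, 1005, 995]

print(is_anomaly(1015, baseline, sensitivity=2))  # False: inside the wider band
print(is_anomaly(1015, baseline, sensitivity=1))  # True: the band is narrower
```

Because the multiplier scales the width of the band, a lower sensitivity value produces a narrower band and therefore more alerts.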
Explicit Mode
Set specific row count thresholds when you know exactly what to expect:

| Setting | Description |
|---|---|
| Min rows | Alert if row count falls below this value |
| Max rows | Alert if row count exceeds this value |
| Best for | Tables with known, fixed expectations |
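As a rough illustration of explicit mode, the check reduces to two comparisons. The function name is hypothetical, and treating each bound as optional is an assumption rather than documented behavior.

```python
from typing import Optional

def violates_thresholds(
    row_count: int,
    min_rows: Optional[int] = None,
    max_rows: Optional[int] = None,
) -> Optional[str]:
    """Return "too low", "too high", or None when the count is in range.

    Assumption: a bound left as None is simply not checked.
    """
    if min_rows is not None and row_count < min_rows:
        return "too low"
    if max_rows is not None and row_count > max_rows:
        return "too high"
    return None

print(violates_thresholds(50, min_rows=100, max_rows=10_000))      # too low
print(violates_thresholds(500, min_rows=100, max_rows=10_000))     # None
print(violates_thresholds(20_000, min_rows=100, max_rows=10_000))  # too high
```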
Time-Windowed Counting
For tables that accumulate data over time, use a timestamp column to count rows within a specific window:

| Window | Use Case |
|---|---|
| 1 hour | Real-time event streams |
| 6 hours | Frequent batch loads |
| 12 hours | Twice-daily pipelines |
| 24 hours | Daily batch ETL (most common) |
| 168 hours | Weekly aggregates |
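To make the windows concrete, here is a small sketch of time-windowed counting over in-memory timestamps. The function name and sample data are hypothetical, and whether the window boundary is inclusive is an assumption; in a warehouse the same idea would be a `COUNT(*)` filtered on the timestamp column.

```python
from datetime import datetime, timedelta, timezone

def count_in_window(timestamps, window_hours, now):
    """Count timestamps in the half-open window (now - window, now]."""
    cutoff = now - timedelta(hours=window_hours)
    return sum(1 for ts in timestamps if cutoff < ts <= now)

now = datetime(2024, 1, 8, tzinfo=timezone.utc)
# Hypothetical rows loaded 30 minutes, 5 hours, 20 hours, and 3 days ago.
rows = [
    now - timedelta(minutes=30),
    now - timedelta(hours=5),
    now - timedelta(hours=20),
    now - timedelta(days=3),
]

for hours in (1, 6, 24, 168):
    print(hours, count_in_window(rows, hours, now))
# 1 1
# 6 2
# 24 3
# 168 4
```

The same table yields different counts depending on the window, which is why the window should match how often the table is loaded.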
Setting Up Row Count Monitoring
Open Data Quality Tab
Click the Data Quality tab on the asset detail page, then scroll to the Row Count Monitoring section.
Create Schedule
Click Create Schedule and configure:
- Table: Select the table to monitor
- Timestamp column: (Optional) For time-windowed counting
- Time window: How far back to count rows
- Check interval: How often to check (1h, 6h, 12h, 24h)
Choose Monitoring Mode
Select your monitoring approach:
Auto-Learn Mode:
- Toggle Auto-learn on
- Set Sensitivity (1-4, lower = more sensitive)
- Wait for learning period to complete
Explicit Mode:
- Toggle Auto-learn off
- Set Expected min rows
- Set Expected max rows
Understanding Results
Status Indicators
| Status | Meaning | Action |
|---|---|---|
| Healthy | Row count within expected range | None needed |
| Anomaly | Row count outside expected range | Investigate the cause |
| Learning | Collecting baseline data | Wait for learning to complete |
| No Data | No checks have run yet | Check will run on next interval |
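One way to picture how the four statuses relate in auto-learn mode is the sketch below. It is not the product's implementation: the function name is hypothetical, the 7-point minimum is an assumption drawn from the learning period described earlier, and the range uses the same mean and standard deviation rule.

```python
from statistics import mean, stdev

MIN_BASELINE_POINTS = 7  # assumption based on the "7+ days" learning period

def status(counts, sensitivity=2.0):
    """Classify the latest check given all counts so far, oldest first."""
    if not counts:
        return "No Data"
    *history, latest = counts
    if len(history) < MIN_BASELINE_POINTS:
        return "Learning"
    center = mean(history)
    spread = stdev(history) * sensitivity
    if center - spread <= latest <= center + spread:
        return "Healthy"
    return "Anomaly"

baseline = [1000, 1020, 980, 1010, 990, 1005, 995]
print(status([]))                 # No Data
print(status([1000, 1010]))       # Learning
print(status(baseline + [1005]))  # Healthy
print(status(baseline + [0]))     # Anomaly
```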
Anomaly Types
| Anomaly | Possible Causes |
|---|---|
| Row count too low | ETL failure, data loss, filter bug, source issue |
| Row count too high | Duplicate load, removed filter, upstream spike |
| Row count zero | Complete ETL failure, wrong table, permissions |
Best Practices
Choose the Right Mode
| Scenario | Recommended Mode |
|---|---|
| Data patterns vary naturally | Auto-learn with sensitivity 2-3 |
| Exact expectations known | Explicit with min/max thresholds |
| New table, unknown patterns | Auto-learn with sensitivity 3-4 |
| Critical data, low tolerance | Auto-learn with sensitivity 1-2 |
Set Appropriate Windows
| Data Pattern | Recommended Window |
|---|---|
| Real-time streaming | 1 hour |
| Hourly batch jobs | 6 hours |
| Daily batch jobs | 24 hours |
| Weekly aggregates | 168 hours |
Start Conservative, Then Tighten
- Week 1: Use auto-learn with sensitivity 3 (less sensitive)
- Weeks 2-4: Review any anomalies and increase the sensitivity value if alerts are too noisy
- Month 2+: Tighten to sensitivity 2 once patterns are stable
Row Count vs. Metrics
| Feature | Row Count | Data Quality Metrics |
|---|---|---|
| Purpose | Monitor row counts | Monitor column statistics |
| Scope | Table-level | Column-level |
| ML-based | Yes (auto-learn) | Yes (anomaly detection) |
| Time windows | Yes | No |
| Explicit thresholds | Yes | Via checks |
Troubleshooting
Status shows 'Learning' for too long
Causes:
- Not enough data points collected yet
- Check interval is very long (weekly)
Solutions:
- Wait for at least 7 data points (7 days for daily checks)
- Consider switching to explicit mode if you know expected values
Too many false positive anomalies
Causes:
- Sensitivity is too low (too sensitive)
- Natural data variation is high
- Seasonality not yet learned
Solutions:
- Increase sensitivity (e.g., 2 to 3)
- Allow more baseline data (30+ days)
- Switch to explicit mode with wider thresholds
Missing real anomalies
Causes:
- Sensitivity is too high (not sensitive enough)
- Baseline includes anomalous data
Solutions:
- Decrease sensitivity (e.g., 3 to 2)
- Switch to explicit mode with tighter thresholds
Row count always zero with time window
Causes:
- Timestamp column has no recent data
- Wrong timestamp column selected
- Time window too narrow
Solutions:
- Verify timestamp column has data in the window
- Check column data type (should be timestamp/datetime)
- Widen the time window
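The zero-count checks above can be sketched as a small diagnostic. It assumes you have already fetched the newest value in the timestamp column yourself (for example with a `MAX` query against your warehouse); the function name and messages are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def diagnose_zero_count(
    latest_ts: Optional[datetime], window_hours: int, now: datetime
) -> str:
    """Suggest a likely cause for a zero count in a time window."""
    if latest_ts is None:
        # No usable timestamps at all: wrong column or wrong data type.
        return "no timestamps found: check the selected column and its data type"
    age = now - latest_ts
    if age > timedelta(hours=window_hours):
        # Newest row is outside the window.
        return "newest row is older than the window: widen the window or check for stale data"
    return "recent rows exist in this column: confirm the schedule uses this column"

now = datetime(2024, 1, 8, tzinfo=timezone.utc)
print(diagnose_zero_count(now - timedelta(hours=30), 24, now))
```

In this example the newest row is 30 hours old and the window is 24 hours, so the sketch points at the window or at stale data.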
What’s Next
- Set Up Alerts: Get notified when row count anomalies are detected
- Data Quality Metrics: Monitor column-level statistics like null percentages
- Freshness Monitoring: Track when data was last updated
- Report Badges: Embed row count status in dashboards
