name: dst-check-freshness description: Check data freshness and age for DST tables in DuckDB. Use when determining if data needs refreshing or validating data currency before analysis.
DST Check Freshness Skill
Purpose
Verify data freshness and determine if a refresh is needed. This is essential for ensuring your analysis uses current data and making informed decisions about when to re-fetch from DST.
When to Use
- Before starting analysis (ensure data is current)
- User asks about data age or currency
- Deciding whether to re-fetch data from DST
- Validating that analysis uses recent data
- After discovering data with dst-list-tables skill
- As part of automated data refresh workflows
How to Use
Basic Freshness Check
Check how old the data is:
python scripts/db/query_metadata.py --table-id <TABLE_ID> --check-freshness
Check Against Threshold
Determine if data is fresh or stale using a specific age threshold:
python scripts/db/query_metadata.py --table-id <TABLE_ID> --check-freshness --max-age-days <N>
Expected Output
The freshness check provides:
Without Threshold
======================================================================
FRESHNESS CHECK FOR TABLE: FOLK1A
======================================================================
Table Name: Population at the first day of the quarter
Last Updated: 2025-10-15T10:30:00
Fetch Timestamp: 2025-10-25T09:15:00
Data Age: 5 days ago (5 days)
======================================================================
With Threshold
======================================================================
FRESHNESS CHECK FOR TABLE: FOLK1A
======================================================================
Table Name: Population at the first day of the quarter
Last Updated: 2025-10-15T10:30:00
Fetch Timestamp: 2025-10-25T09:15:00
Data Age: 5 days ago (5 days)
Threshold: 30 days
Status: ✓ FRESH
→ Data is within acceptable age threshold
======================================================================
Stale Data Example
Status: ✗ STALE
→ Data exceeds age threshold - consider refreshing
Freshness Thresholds (Recommended)
Choose threshold based on how often DST updates the source data:
| Update Frequency | Recommended Threshold | Example Use Case |
|---|---|---|
| Daily | 1-2 days | Daily economic indicators |
| Weekly | 7 days | Weekly employment stats |
| Monthly | 30 days | Monthly population data |
| Quarterly | 90 days | Quarterly financial data |
| Annually | 365 days | Yearly census data |
Note: These are guidelines. Adjust based on your analysis requirements and tolerance for data lag.
Decision Guide
✓ FRESH Status
- Action: Proceed with analysis confidently
- Rationale: Data is within acceptable age
- Next Step: Continue with dst-query or table-summary
✗ STALE Status
- Action: Consider refreshing data
- Rationale: Data exceeds age threshold
- Next Step: Evaluate if staleness impacts your analysis
- Refresh: Use Fetcher Agent if refresh needed
Evaluation Questions
Does staleness matter for this analysis?
- Trends over time: Fresh data may not be critical
- Current snapshot: Fresh data is essential
What's the cost of using stale data?
- Low impact: Proceed with disclaimer
- High impact: Refresh immediately
How expensive is refresh?
- Small table: Refresh quickly
- Large table: Evaluate necessity first
Interpreting Results
Last Updated vs Fetch Timestamp
- Last Updated: When DST updated the source data
- Fetch Timestamp: When we downloaded it locally
Important: If last_updated is recent but fetch_timestamp is old, the source data has changed and a refresh will get newer data.
Age Calculation
- "5 days ago": Simple, human-readable
- (5 days): Exact day count for threshold comparison
- Just now: Recently fetched (< 1 minute)
- X hours/minutes ago: Very recent
- X months ago: Quite old
- X years ago: Very stale
Next Steps
If Fresh → Continue Analysis
# Proceed with query or analysis
python scripts/db/table_summary.py --table-id <TABLE_ID>
python scripts/db/query_data.py --sql "SELECT * FROM dst_<table_id> LIMIT 10"
If Stale → Inform User
Present findings to user:
- Data age
- Threshold used
- Recommendation (refresh or proceed)
If Refresh Needed → Switch to Fetcher
python scripts/fetch_and_store.py --table-id <TABLE_ID> --overwrite
If Acceptable → Proceed with Disclaimer
Continue analysis but note data age in results:
- "Analysis based on data from [date]"
- "Data is X days old as of [today]"
Examples
Example 1: Basic freshness check
python scripts/db/query_metadata.py --table-id FOLK1A --check-freshness
Example 2: Check with 30-day threshold
python scripts/db/query_metadata.py --table-id FOLK1A --check-freshness --max-age-days 30
Example 3: Strict 7-day threshold for daily data
python scripts/db/query_metadata.py --table-id DAILY_STATS --check-freshness --max-age-days 7
Example 4: Lenient 90-day threshold for quarterly data
python scripts/db/query_metadata.py --table-id QUARTERLY_FIN --check-freshness --max-age-days 90
Tips
Before Important Analysis
- Always check freshness before critical analyses
- Document data age in your results
- Consider impact of staleness on conclusions
Update Frequency Awareness
- Research DST's update schedule for specific tables
- Some tables update daily, others annually
- Match threshold to update frequency
Automated Workflows
- Check freshness before scheduled analyses
- Auto-refresh if stale
- Log refresh decisions
User Communication
- Inform users about data age
- Explain thresholds used
- Recommend actions based on results
Common Workflows
Workflow 1: Pre-Analysis Check
# 1. List available data
python scripts/db/query_metadata.py --list-all
# 2. Check freshness
python scripts/db/query_metadata.py --table-id FOLK1A --check-freshness --max-age-days 30
# 3. If fresh → analyze
# 4. If stale → refresh first
Workflow 2: Conditional Refresh
# 1. Check freshness
python scripts/db/query_metadata.py --table-id FOLK1A --check-freshness --max-age-days 7
# 2. If stale → refresh
python scripts/fetch_and_store.py --table-id FOLK1A --overwrite
# 3. Proceed with fresh data
Workflow 3: Multiple Tables
# Check freshness of all tables needed for analysis
python scripts/db/query_metadata.py --table-id TABLE1 --check-freshness --max-age-days 30
python scripts/db/query_metadata.py --table-id TABLE2 --check-freshness --max-age-days 30
# Refresh any stale tables
# Then proceed with analysis
Troubleshooting
"No fetch timestamp available"
- Table metadata may be incomplete
- Check metadata with dst-list-tables
- Re-fetch data to populate timestamp
Unexpected Age
- Verify system clock is correct
- Check timestamp format in metadata
- Investigate if manual changes were made
Always Shows Stale
- Threshold may be too strict
- Adjust --max-age-days parameter
- Consider DST's actual update frequency
Best Practices
- Check freshness early in your workflow
- Use appropriate thresholds for data type
- Document data age in analysis results
- Automate freshness checks for regular analyses
- Balance refresh cost vs staleness impact
- Communicate clearly with users about data currency