AQI Forecast Dashboard

Current AQI

Next Hour

Confidence:

Forecast Accuracy

Exact class match

Next-Hour Class Probability

Predicted vs Actual AQI

Legend: Predicted, Actual

Select a city and click Run Forecast to load history

Window:

Fetching drift data from Athena…

Click the Data Drift tab to load global drift analysis

Scope:

Reference window:

Recent window:

|z| < 0.5: Negligible
|z| 0.5–1.0: Moderate
|z| 1.0–2.0: Significant
|z| > 2.0: High

Feature Drift Score (z-score)

Prediction Distribution Shift

Legend: Reference, Recent

Feature Statistics

Feature	Ref Mean	Recent Mean	Δ %	Drift Score (σ)
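The columns above can be sketched in a few lines of Python. This is a minimal illustration, not the dashboard's actual code: it assumes the drift score is the shift in the feature mean measured in reference-window standard deviations, and that Δ % is the relative change in the mean. The function names `drift_row` and `drift_band` are hypothetical, and the handling of values that fall exactly on a band boundary is a guess.

```python
from statistics import mean, stdev


def drift_row(reference, recent):
    """Return (ref_mean, recent_mean, delta_pct, z) for one feature.

    Assumption: z = (recent mean - reference mean) / reference std dev.
    The reference window needs at least two values for stdev().
    """
    ref_mean = mean(reference)
    recent_mean = mean(recent)
    ref_std = stdev(reference)  # sample standard deviation
    shift = recent_mean - ref_mean
    z = shift / ref_std if ref_std > 0 else 0.0
    # Relative change is undefined when the reference mean is zero.
    delta_pct = shift / abs(ref_mean) * 100 if ref_mean != 0 else None
    return ref_mean, recent_mean, delta_pct, z


def drift_band(z):
    """Map a drift score to the legend's bands (boundary handling assumed)."""
    magnitude = abs(z)
    if magnitude < 0.5:
        return "Negligible"
    if magnitude <= 1.0:
        return "Moderate"
    if magnitude <= 2.0:
        return "Significant"
    return "High"


# Made-up sample values for a single feature.
ref_mean, recent_mean, delta_pct, z = drift_row([10, 12, 14, 16, 18], [18, 20, 22])
print(f"{ref_mean:.1f} {recent_mean:.1f} {delta_pct:+.1f}% {z:+.2f} {drift_band(z)}")
```

Dividing by the reference standard deviation keeps scores comparable across features with different units, which is what lets one set of bands apply to every row.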

Accuracy Trend — Concept Drift Detection

Daily exact-match accuracy over the last 8 weeks, with all 99 cities pooled. A declining trend indicates concept drift.

Status: ⚠ Concept Drift Detected / Stable

Last 7d

Prior 7d

WoW Δ
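One way the three figures above could be derived from the daily accuracy series is sketched below. The page does not state the rule that flips the status badge, so the `threshold` of a 5-point drop is purely an assumption, as is the function name `week_over_week`.

```python
from statistics import mean


def week_over_week(daily_accuracy, threshold=-0.05):
    """Compare the last 7 days of accuracy with the 7 days before them.

    daily_accuracy: floats in [0, 1], oldest first, at least 14 values.
    threshold: hypothetical cut-off; a WoW change at or below it is
    reported as concept drift.
    """
    if len(daily_accuracy) < 14:
        raise ValueError("need at least 14 daily accuracy values")
    last_7d = mean(daily_accuracy[-7:])
    prior_7d = mean(daily_accuracy[-14:-7])
    wow_delta = last_7d - prior_7d
    status = "Concept Drift Detected" if wow_delta <= threshold else "Stable"
    return last_7d, prior_7d, wow_delta, status


# Made-up series: accuracy drops from 0.80 to 0.70 in the latest week.
last_7d, prior_7d, wow_delta, status = week_over_week([0.80] * 7 + [0.70] * 7)
print(f"Last 7d {last_7d:.2f} | Prior 7d {prior_7d:.2f} | WoW {wow_delta:+.2f} | {status}")
```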

Loading accuracy trend…

Open the Data Drift tab to load the global accuracy trend

Window:


API Status

Model

features → AQI classes

Coverage

Cities monitored globally

Global Weighted F1

All 99 cities — no city filter

Live Production Metrics

Need ≥ 10 labelled rows in Athena

Weighted F1

Weighted Precision

TP / (TP + FP) per class

Weighted Recall

TP / (TP + FN) per class
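The per-class formulas above, combined with support weighting, can be sketched in plain Python as follows. This assumes "weighted" means each class's score is weighted by its share of the true labels, which is the usual convention. The `min_rows` guard mirrors the "≥ 10 labelled rows" note. The function name and the sample class labels are illustrative only.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred, min_rows=10):
    """Support-weighted precision, recall and F1; None if too few rows."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) < min_rows:
        return None
    support = Counter(y_true)
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    totals = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    for cls, count in support.items():
        precision = tp[cls] / (tp[cls] + fp[cls]) if tp[cls] + fp[cls] else 0.0
        recall = tp[cls] / (tp[cls] + fn[cls]) if tp[cls] + fn[cls] else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        weight = count / len(y_true)
        totals["precision"] += weight * precision
        totals["recall"] += weight * recall
        totals["f1"] += weight * f1
    return totals


# Ten made-up labelled rows with two illustrative classes.
y_true = ["Good"] * 6 + ["Moderate"] * 4
y_pred = ["Good"] * 5 + ["Moderate"] + ["Moderate"] * 3 + ["Good"]
metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

Note that support-weighted recall always equals overall exact-match accuracy, so it should track the "Exact class match" figure when both are computed on the same rows.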

Per-Class Breakdown

AQI Class	F1	Precision	Recall	Support

Data Freshness

Select a city and run forecast first

Drift Summary — Current vs Prior Week

Switch to the Data Drift tab to load drift analysis

Auto-refresh

Forecast and drift data are re-fetched automatically every hour

Next refresh in

Daily Retrain Workflow

1. Check drift daily

Open the Data Drift tab. If any feature has |z| > 1.0 (orange/red), a retrain is recommended.

2. Rolling retrain (1 year)

python ml/train.py --lookback-weeks 52

3. Automated via cron (05:00 UTC)

bash ml/retrain_daily.sh
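The manual check in steps 1 and 2 could be scripted roughly as below. This is a sketch under assumptions: the drift scores are supplied as a plain dict of feature name to z-score (the feature names shown are made up), and the helper names are hypothetical. Only the `ml/train.py --lookback-weeks 52` command comes from the workflow itself. By default the function only returns the command it would run.

```python
import subprocess
import sys

RETRAIN_THRESHOLD = 1.0  # from step 1: any |z| > 1.0 recommends a retrain


def features_needing_retrain(drift_scores, threshold=RETRAIN_THRESHOLD):
    """Return the sorted names of features whose |z| exceeds the threshold."""
    return sorted(name for name, z in drift_scores.items() if abs(z) > threshold)


def retrain_command(lookback_weeks=52):
    """Build the rolling-retrain command from step 2."""
    return [sys.executable, "ml/train.py", "--lookback-weeks", str(lookback_weeks)]


def maybe_retrain(drift_scores, dry_run=True):
    """Return the retrain command if drift warrants it, else None.

    With dry_run=False the command is actually executed.
    """
    if not features_needing_retrain(drift_scores):
        return None
    cmd = retrain_command()
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


# Made-up drift scores; the feature names are placeholders.
scores = {"pm25": 1.4, "temperature": -0.3, "wind_speed": -2.1}
print(features_needing_retrain(scores))
print(maybe_retrain(scores))
```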

Loading feature importance…

Importance type

Feature Importance

All Types Comparison

Feature	Gain	Weight	Cover

Gain Trend: how the influence of the top features shifts after each daily retrain

The gain trend chart appears after the second daily retrain (05:00 UTC). Check back tomorrow.

How to Read

Gain — avg. reduction in loss when this feature is used in a split. Best proxy for predictive power.

Weight — how often this feature appears in splits across all trees. High = frequently used, not necessarily impactful.

Cover — avg. number of training samples affected by splits on this feature. High = broadly applicable splits.
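To make the three definitions concrete, the sketch below computes them from a flat list of split records. The `(feature, gain, cover)` tuple format and the function name are hypothetical stand-ins for whatever the tree library exposes, and the numbers are made up. The averaging follows the descriptions above: gain and cover are per-split averages, and weight is a split count.

```python
from collections import defaultdict


def importance_table(splits):
    """Aggregate per-split records into gain / weight / cover per feature.

    splits: iterable of (feature, gain, cover) tuples, one per tree split.
    """
    stats = defaultdict(lambda: {"weight": 0, "gain_sum": 0.0, "cover_sum": 0.0})
    for feature, gain, cover in splits:
        entry = stats[feature]
        entry["weight"] += 1          # number of splits using the feature
        entry["gain_sum"] += gain     # loss reduction achieved by the split
        entry["cover_sum"] += cover   # samples reaching the split
    return {
        feature: {
            "gain": entry["gain_sum"] / entry["weight"],
            "weight": entry["weight"],
            "cover": entry["cover_sum"] / entry["weight"],
        }
        for feature, entry in stats.items()
    }


# Made-up split records across all trees.
table = importance_table([
    ("pm25", 10.0, 200.0),
    ("pm25", 6.0, 100.0),
    ("humidity", 2.0, 50.0),
])
for feature, row in table.items():
    print(feature, row)
```

In this toy data `pm25` ranks first on all three measures. In practice a feature can have a high weight but a low gain, which is why the text above calls gain the best proxy for predictive power.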