mtgarb · buy globally → sell AU · MTG + Pokémon

rigor

statistical accountability for the second-model · cream-ochre-black · 2026-05-16

no data yet

Phase-1 framework deployed; awaiting first metric writes from npm run score-predictions and scripts/check_drift.py.

Empty state expected until the second-model has accumulated outcomes. Tables read: model_drift_metrics, model_predictions × prediction_outcomes.

rolling accuracy per signal

(no metric rows yet)

A signal is marked REJECT after 14 consecutive days of MCC < 0.10. Alerts fire when sMAPE > 1.0 or Brier score > 0.30.
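The three metrics and the REJECT rule can be sketched as below. This is a minimal illustration and not the scorer behind npm run score-predictions. The function names are hypothetical. It assumes the sMAPE variant scaled 0 to 2 (the only common scaling on which a 1.0 threshold can fire), and it assumes "14 consecutive days" means the 14 most recent daily values.

```python
import math
from typing import Sequence


def mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def brier(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE on the 0..2 scale (assumed variant)."""
    total = 0.0
    for a, f in zip(actual, forecast):
        d = abs(a) + abs(f)
        total += 0.0 if d == 0 else 2 * abs(f - a) / d
    return total / len(actual)


def should_reject(daily_mcc: Sequence[float],
                  threshold: float = 0.10, days: int = 14) -> bool:
    """True when the most recent `days` daily MCC values are all below threshold."""
    if len(daily_mcc) < days:
        return False
    return all(m < threshold for m in daily_mcc[-days:])
```

With fewer than 14 daily values the rule returns False, which matches the empty state on this page: no rows, no REJECT.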

probability of backtest overfitting

(no pbo rows yet — write '<signal>:pbo' to model_drift_metrics)
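For orientation, one common way to estimate probability of backtest overfitting is combinatorially symmetric cross-validation (CSCV). The sketch below is an assumption about the method, not a description of what the sidecar computes: the function name pbo_cscv is hypothetical, the input is a hypothetical matrix of per-period scores (rows are periods, columns are candidate configurations), and mean score stands in for the Sharpe ratio used in the original formulation.

```python
import math
from itertools import combinations
from typing import List, Sequence


def pbo_cscv(scores: Sequence[Sequence[float]], n_blocks: int = 4) -> float:
    """Fraction of in-sample/out-of-sample splits where the in-sample winner
    lands at or below the out-of-sample median (logit of relative rank <= 0)."""
    if n_blocks % 2 != 0:
        raise ValueError("n_blocks must be even")
    t, n = len(scores), len(scores[0])
    size = t // n_blocks  # trailing rows that do not fill a block are ignored
    blocks = [list(range(i * size, (i + 1) * size)) for i in range(n_blocks)]

    def mean_perf(rows: List[int]) -> List[float]:
        return [sum(scores[r][j] for r in rows) / len(rows) for j in range(n)]

    logits = []
    for train in combinations(range(n_blocks), n_blocks // 2):
        is_rows = [r for b in train for r in blocks[b]]
        oos_rows = [r for b in range(n_blocks) if b not in train
                    for r in blocks[b]]
        is_perf, oos_perf = mean_perf(is_rows), mean_perf(oos_rows)
        best = max(range(n), key=is_perf.__getitem__)  # in-sample winner
        # Out-of-sample rank of the winner, 1 (worst) .. n (best).
        rank = sum(1 for v in oos_perf if v <= oos_perf[best])
        omega = rank / (n + 1)  # relative rank, strictly inside (0, 1)
        logits.append(math.log(omega / (1 - omega)))
    return sum(1 for x in logits if x <= 0) / len(logits)
```

A value near 0 means the in-sample winner tends to stay above the median out of sample; a value near 1 means selection is mostly fitting noise.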

feature-drift PSI (60-day window)

(no drift rows yet — run scripts/check_drift.py on a weekly cron)
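As a reference for what a PSI row measures, here is a minimal sketch of the population stability index between a reference sample and a current sample of one feature. It is not the code in scripts/check_drift.py. The function name, the equal-width binning over the reference range, and the eps floor for empty bins are all assumptions, and how the 60-day window is split into reference and current samples is left open.

```python
import math
from typing import List, Sequence


def psi(reference: Sequence[float], current: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population stability index: sum((cur - ref) * ln(cur / ref)) over bins."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has zero range")
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor each share at eps so the log is defined for empty bins.
        return [max(c / len(values), eps) for c in counts]

    ref = proportions(reference)
    cur = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

Identical samples give 0.0; the value grows as mass moves between bins.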

calibration plots (predicted × realised)

(no resolved predictions yet — calibration appears once prediction_outcomes is populated)
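Once outcomes exist, the points on a calibration plot can be derived by binning predicted probabilities and comparing each bin's mean prediction with its realised hit rate. The sketch below assumes equal-width bins and binary outcomes; the function name is hypothetical and nothing here reflects the actual layout of model_predictions or prediction_outcomes.

```python
from typing import List, Sequence, Tuple


def calibration_bins(probs: Sequence[float], outcomes: Sequence[int],
                     n_bins: int = 10) -> List[Tuple[float, float, int]]:
    """Return (mean predicted, realised rate, count) for each non-empty bin."""
    acc = [[0.0, 0, 0] for _ in range(n_bins)]  # [sum of p, hits, count]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        acc[idx][0] += p
        acc[idx][1] += o
        acc[idx][2] += 1
    return [(s / n, hits / n, n) for s, hits, n in acc if n > 0]
```

A well-calibrated signal has points close to the diagonal, where mean predicted equals realised rate; the count per bin shows how much weight each point deserves.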

framework: docs/predictions-prereg/rigor-framework.md · sidecar: python/mtgarb-validation/ · cron: scripts/daily.sh writes via npm run score-predictions at 04:00