mtgarb · buy globally → sell AU · MTG + Pokémon

v1 vs v2 — side-by-side

Operator gut-check. v1 = existing engine (point-estimate ROI rank); v2 = inference rebuild (tiered: deterministic buylist + Bayesian regression log-price posterior). Rows show where the two disagree most. Read a large absolute delta as "v2 thinks v1's $X estimate is off by Δ". Shadow-window discipline was retired 2026-05-16 (operator amendment), so p50 + Δ are visible again; the width column is kept as a calibration health metric.
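To make the v2 columns concrete, here is a minimal sketch of how p25 / p50 / p75, width and P(profit) could be read off a log-price posterior. It assumes the posterior over log sell price is approximately normal with mean `mu` and standard deviation `sigma`. That normal approximation, the function name `summarise_log_price` and the `breakeven` argument are illustrative assumptions; the actual v2 model internals are not shown on this page.

```python
import math
from statistics import NormalDist


def summarise_log_price(mu: float, sigma: float, breakeven: float) -> dict:
    """Summarise an (assumed) normal posterior over log(sell price).

    mu, sigma : posterior mean and std dev of log price (hypothetical inputs)
    breakeven : sell price at which the pick breaks even (hypothetical input)
    """
    log_price = NormalDist(mu, sigma)
    # Quantiles are taken in log space, then mapped back to dollars.
    p25, p50, p75 = (math.exp(log_price.inv_cdf(q)) for q in (0.25, 0.50, 0.75))
    # P(profit) = P(price > breakeven) = P(log price > log breakeven).
    prob_profit = 1.0 - log_price.cdf(math.log(breakeven))
    return {
        "p25": p25,
        "p50": p50,
        "p75": p75,
        "width": p75 - p25,
        "prob_profit": prob_profit,
    }


# Illustrative numbers only, not taken from the scan run.
summary = summarise_log_price(mu=math.log(100.0), sigma=1.0, breakeven=150.0)
```

Under this assumption the quantiles are symmetric in log space, so p75 / p50 equals p50 / p25. The rows on this page are consistent with that pattern (for example $41 / $101 / $248), though that alone does not confirm how v2 computes them.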

game: all | magic | pokemon · sort: abs Δ | v2 P(profit) | Tier 1 first | v2 E[U]
pokemon · scan_run bca25adb · started 2026-05-16 20:07 · v1 opps: 19 · v2 preds: 38 · model: v1.1-pokemon-tier3-2026-05-16T0423Z
Audits: latest per (agent, slice). Operator-evidence only; engine + portfolio do not read.
agent · slice · recommendation · findings · n_pred · n_paired · conf · when
comparable-sales-enumerator · prediction 1aaccd4d · PROCEED_DEEP_DIVE · 12120 · 0.82 · 14h ago
Summary: 6 high-similarity (>=0.85) AU sold comparables for Fifth Dawn #114 nonfoil over 180d, ranging A$24.18 to A$47.12. Top-5 by similarity median ~A$45.53 (range A$24.18-A$47.12). Two LP-condition exact matches around A$24-30 (881d50a9, bda00fe0), three LP/MP exact matches A$45-47. Operator landed cost A$31.63 sits between the soft-floor LP cluster and the harder MP/no-#-but-named-set cluster. Cross-printing references cluster A$36-52 and support a A$40+ ceiling. EU CM EUR 16.57 (~A$27.34 floor), TCG USD 30.67 (~A$47.54 ceiling). 17 cross-printing rows and 1 SLD foil / 1 signed / 1 proxy excluded - verdict: deep dive justified.
anomaly-watcher · magic tier=0 1d · INVESTIGATE_AND_DERATE (high) · 3 · 0 · 0 · 0.90 · 14h ago
Summary: RED. Three HIGH-severity findings. (1) eBay AU listings 20% of 7d baseline (58 vs 294) - likely scraper outage or missing EBAY_APP_ID/CERT on VPS. (2)(3) Pokemon foil cm_eur 100% NULL for pre-2014 AND 2014-2020 eras (n=243 combined predictions) - matches documented foil-column inversion bug, landed costs understated 3-5x. Operator next: SSH to VPS to verify EBAY_APP_ID/CERT and re-run eBay scrape; concurrently derate or suppress Pokemon foil picks in pre-2020 eras until column-mapping fix lands.
calibration-auditor · magic tier=3 90d · INVESTIGATE_AND_DERATE (high) · 7441322574 · 0.65 · 14h ago
Summary: Model exhibits two simultaneous systematic biases: (a) severe undershoot in the price_tier:300_plus slice (median +158% residual, coverage 22.7%) where the model both under-predicts AND has bands far too narrow — high-confidence evidence of pairing/cross-printing contamination in the source data, mirrored by the Mystic Remora alternating-realised pattern in recent_samples; (b) consistent overshoot across modern_2003_2014, recent_2015_2020, and sub-A$100 price tiers (-27% to -55% median residuals) suggesting expensive comparables are leaking into cheap-printing predictions. The prob_profit calibration line is non-monotonic (decile [0.2-0.3) collapses to 4.6% actual vs 26.8% predicted) and unsuitable for position sizing. Vintage_pre_2003 (n=10,032) is near-balanced (-8.3%) and the dominant slice masking issues in aggregate. Recommend INVESTIGATE_AND_DERATE: investigate pairing/printing-resolution in the realised-outcome join (cross-printing leak is the leading hypothesis per CLAUDE.md priors), and operator-side derate Tier 3 picks in price_tier:300_plus and the modern_2003_2014 era until investigation completes. No retraining or prior changes per pre-reg shadow integrity.
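The two headline numbers in the calibration summary, median residual and band coverage, can be sketched as follows. This assumes the residual is defined as (realised − predicted p50) / predicted p50, so a positive value means the model under-predicted, and that coverage is the share of realised prices falling inside the predicted band. Both definitions, the function name `calibration_summary` and the sample rows are assumptions for illustration; the auditor's actual definitions and band level are not stated here.

```python
from statistics import median


def calibration_summary(rows):
    """Median relative residual and band coverage for one slice.

    rows: iterable of (band_lo, p50, band_hi, realised) tuples.
    The tuple layout is a hypothetical convenience, not the real schema.
    """
    rows = list(rows)
    if not rows:
        raise ValueError("need at least one paired prediction")
    # Positive residual: realised price came in above the predicted median.
    residuals = [(realised - p50) / p50 for _, p50, _, realised in rows]
    # Coverage: fraction of realised prices inside the predicted band.
    covered = sum(1 for lo, _, hi, realised in rows if lo <= realised <= hi)
    return {
        "n": len(rows),
        "median_residual": median(residuals),
        "coverage": covered / len(rows),
    }


# Invented rows that mimic an "undershoot with narrow bands" slice.
sample = [
    (80.0, 100.0, 120.0, 250.0),
    (90.0, 100.0, 110.0, 105.0),
    (40.0, 50.0, 60.0, 130.0),
    (20.0, 25.0, 30.0, 70.0),
]
result = calibration_summary(sample)
```

Running this per slice (era, price tier) rather than on the aggregate is what would surface the pattern described above, where one large near-balanced slice hides biased smaller ones.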
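| # | card | set | fol | v1 sell | v1 profit | v1 roi | v1 vel | v2 p25 | v2 p50 | v2 p75 | v2 width | v2 P(profit) | v2 E[U] | T1 | Δ p50−v1 | Review |
| 2 | Latias & Latios-GX | sm9 |  | $2355 | $1415 | 4% | weak | $41 | $101 | $248 | $207 | 15.3% | -861 | · | $-2254 |  |
| 1 | Latias & Latios-GX | sm9 |  | $2355 | $1415 | 4% | weak | $41 | $101 | $248 | $207 | 15.3% | -861 | · | $-2254 |  |
| 3 | Gengar & Mimikyu-GX | sm9 | · | $1179 | $641 | 3% | medium | $7 | $17 | $41 | $34 | 1.8% | -652 | · | $-1162 |  |
| 4 | Gengar & Mimikyu-GX | sm9 | · | $1179 | $641 | 3% | medium | $7 | $17 | $41 | $34 | 1.8% | -652 | · | $-1162 |  |
| 6 | Arceus & Dialga & Palkia-GX | sm12 |  | $820 | $414 | 2% | medium | $26 | $63 | $155 | $129 | 17.3% | -473 | · | $-757 |  |
| 5 | Arceus & Dialga & Palkia-GX | sm12 |  | $820 | $414 | 2% | medium | $26 | $63 | $155 | $129 | 17.3% | -473 | · | $-757 |  |
| 9 | Rocket's Zapdos ex | ex7 |  | $658 | $340 | 2% | weak | $36 | $87 | $215 | $179 | 30.6% | -321 | · | $-571 |  |
| 10 | Rocket's Zapdos ex | ex7 |  | $658 | $340 | 2% | weak | $36 | $87 | $215 | $179 | 30.6% | -321 | · | $-571 |  |
| 7 | Deoxys-EX | bw9 |  | $497 | $297 | 4% | medium | $27 | $67 | $165 | $137 | 41.7% | -138 | · | $-429 |  |
| 8 | Deoxys-EX | bw9 |  | $497 | $297 | 4% | medium | $27 | $67 | $165 | $137 | 41.7% | -138 | · | $-429 |  |
| 17 | Gengar-EX | xy4 |  | $441 | $204 | 2% | weak | $28 | $69 | $170 | $142 | 29.7% | -265 | · | $-372 |  |
| 18 | Gengar-EX | xy4 |  | $441 | $204 | 2% | weak | $28 | $69 | $170 | $142 | 29.7% | -265 | · | $-372 |  |
| 13 | M Gengar-EX | xy4 |  | $431 | $244 | 3% | weak | $26 | $63 | $153 | $128 | 39.3% | -146 | · | $-368 |  |
| 14 | M Gengar-EX | xy4 |  | $431 | $244 | 3% | weak | $26 | $63 | $153 | $128 | 39.3% | -146 | · | $-368 |  |
| 20 | Gengar & Mimikyu-GX | sm9 |  | $376 | $193 | 2% | weak | $23 | $57 | $140 | $117 | 33.9% | -177 | · | $-319 |  |
| 19 | Gengar & Mimikyu-GX | sm9 |  | $376 | $193 | 2% | weak | $23 | $57 | $140 | $117 | 33.9% | -177 | · | $-319 |  |
| 36 | Team Magma's Groudon-EX | dc1 |  | $383 | $154 | 1% | weak | $27 | $66 | $163 | $136 | 27.5% | -284 | · | $-317 |  |
| 35 | Team Magma's Groudon-EX | dc1 |  | $383 | $154 | 1% | weak | $27 | $66 | $163 | $136 | 27.5% | -284 | · | $-317 |  |
| 21 | Black Kyurem-EX | bw7 |  | $308 | $181 | 3% | weak | $24 | $58 | $143 | $119 | 50.1% | -72 | · | $-249 |  |
| 22 | Black Kyurem-EX | bw7 |  | $308 | $181 | 3% | weak | $24 | $58 | $143 | $119 | 50.1% | -72 | · | $-249 |  |
| 25 | Slowpoke & Psyduck-GX | sm11 |  | $299 | $176 | 3% | weak | $21 | $51 | $124 | $104 | 46.9% | -77 | · | $-248 |  |
| 26 | Slowpoke & Psyduck-GX | sm11 |  | $299 | $176 | 3% | weak | $21 | $51 | $124 | $104 | 46.9% | -77 | · | $-248 |  |
| 15 | Greninja & Zoroark-GX | sm10 |  | $296 | $179 | 4% | weak | $20 | $50 | $123 | $102 | 49.9% | -63 | · | $-246 |  |
| 16 | Greninja & Zoroark-GX | sm10 |  | $296 | $179 | 4% | weak | $20 | $50 | $123 | $102 | 49.9% | -63 | · | $-246 |  |
| 29 | Solgaleo & Lunala-GX | sm12 |  | $290 | $134 | 2% | weak | $20 | $49 | $121 | $101 | 31.9% | -169 | · | $-241 |  |
| 30 | Solgaleo & Lunala-GX | sm12 |  | $290 | $134 | 2% | weak | $20 | $49 | $121 | $101 | 31.9% | -169 | · | $-241 |  |
| 27 | Light Dragonite | neo4 |  | $296 | $174 | 3% | weak | $28 | $67 | $165 | $138 | 55.6% | -56 | · | $-229 |  |
| 28 | Light Dragonite | neo4 |  | $296 | $174 | 3% | weak | $28 | $67 | $165 | $138 | 55.6% | -56 | · | $-229 |  |
| 23 | Armored Mewtwo | smp |  | $259 | $148 | 3% | weak | $21 | $52 | $128 | $107 | 49.1% | -69 | · | $-207 |  |
| 24 | Armored Mewtwo | smp |  | $259 | $148 | 3% | weak | $21 | $52 | $128 | $107 | 49.1% | -69 | · | $-207 |  |
| 31 | Rocket's Articuno ex | ex7 |  | $248 | $131 | 2% | weak | $26 | $63 | $154 | $128 | 50.6% | -75 | · | $-185 |  |
| 32 | Rocket's Articuno ex | ex7 |  | $248 | $131 | 2% | weak | $26 | $63 | $154 | $128 | 50.6% | -75 | · | $-185 |  |
| 11 | Primal Kyogre-EX | xy5 |  | $191 | $104 | 2% | strong | $17 | $41 | $101 | $84 | 46.9% | -63 | · | $-150 |  |
| 12 | Primal Kyogre-EX | xy5 |  | $191 | $104 | 2% | strong | $17 | $41 | $101 | $84 | 46.9% | -63 | · | $-150 |  |
| 38 | Dark Raichu | base5 |  | $143 | $51 | 1% | strong | $27 | $66 | $160 | $134 | 45.8% | -106 | · | $-77 |  |
| 37 | Dark Raichu | base5 |  | $143 | $51 | 1% | strong | $27 | $66 | $160 | $134 | 45.8% | -106 | · | $-77 |  |
| 33 | Gliscor LV.X | dp6 |  | $116 | $75 | 3% | medium | $19 | $46 | $112 | $94 | 67.4% | -6 | · | $-70 |  |
| 34 | Gliscor LV.X | dp6 |  | $116 | $75 | 3% | medium | $19 | $46 | $112 | $94 | 67.4% | -6 | · | $-70 |  |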

Δ p50−v1: large negative = v2 thinks v1 over-estimates the realised AU price. Highlight thresholds: red >30% (or >A$5); ochre >15% (or >A$2); green within band. v2 width column (p75 − p25): wider = model less confident; red >A$20 (or 60% of v1). Review column: latest verdict from any specialist agent in .claude/agents/; hover for agent name + confidence + revised band. Operator-evidence (engine free to read post-shadow).
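The legend's arithmetic can be sketched in a few lines. This is one reading of the thresholds, not the page's actual rendering code: it assumes the percentages are taken relative to the v1 sell price, that either the relative or the absolute condition is enough to trigger a colour, and that "60% of v1" also refers to v1 sell. The function names `delta_flag` and `width_flag` are hypothetical.

```python
def delta_flag(v1_sell: float, v2_p50: float):
    """Return (delta, colour) for the Δ p50−v1 column.

    Assumption: thresholds apply to |delta| relative to v1 sell,
    and either the percentage or the A$ condition triggers the colour.
    """
    delta = v2_p50 - v1_sell
    abs_delta = abs(delta)
    rel = abs_delta / v1_sell if v1_sell else float("inf")
    if rel > 0.30 or abs_delta > 5.0:
        colour = "red"
    elif rel > 0.15 or abs_delta > 2.0:
        colour = "ochre"
    else:
        colour = "green"
    return delta, colour


def width_flag(v1_sell: float, p25: float, p75: float):
    """Return (width, is_red) for the v2 width column (p75 - p25)."""
    width = p75 - p25
    # Assumption: red if wider than A$20 or wider than 60% of v1 sell.
    is_red = width > 20.0 or width > 0.60 * v1_sell
    return width, is_red


# First table row: Latias & Latios-GX (sm9), v1 sell $2355, v2 $41 / $101 / $248.
delta, colour = delta_flag(v1_sell=2355.0, v2_p50=101.0)
width, wide = width_flag(v1_sell=2355.0, p25=41.0, p75=248.0)
```

For that row the sketch reproduces the displayed values (Δ of $-2254 and width of $207). Some other rows differ from this arithmetic by a dollar, which would be expected if the page rounds after computing from unrounded quantiles.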