Evaluation
Held-out accuracy per prediction model. Pass when Aito's accuracy beats the baseline by ≥ 10 pp. Fail means Aito is honestly telling you it doesn't know, even when the raw accuracy is high.
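The pass rule is plain arithmetic on two numbers, so a minimal sketch may help. It assumes accuracy and baseline are both given in percent; the function names `gain_pp` and `verdict` and the sample figures are hypothetical, chosen only for illustration, and are not part of Aito's API or of the results reported here.

```python
def gain_pp(accuracy_pct: float, baseline_pct: float) -> float:
    """Gain over the baseline in percentage points (pp).

    Both inputs are percentages, e.g. 85.0 for 85 %.
    Rounding guards against floating-point noise at the threshold.
    """
    return round(accuracy_pct - baseline_pct, 6)


def verdict(accuracy_pct: float, baseline_pct: float,
            min_gain_pp: float = 10.0) -> str:
    """Return "Pass" when the gain is at least min_gain_pp, else "Fail"."""
    if gain_pp(accuracy_pct, baseline_pct) >= min_gain_pp:
        return "Pass"
    return "Fail"


if __name__ == "__main__":
    # Illustrative numbers only, not measured results.
    print(verdict(85.0, 75.0))  # gain of 10 pp meets the threshold
    print(verdict(97.0, 95.0))  # high raw accuracy, but only 2 pp of gain
```

The second call shows why a high raw accuracy can still fail: when the baseline is already at 95 %, a 97 % model adds little beyond what the baseline provides.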
| Model | Features | Accuracy | Baseline | Gain | n | Verdict |
|---|---|---|---|---|---|---|