A hands-on lab showing how “improving” a single metric (AUC/accuracy/F1) can worsen real-world outcomes. Includes metric audits, slice checks, cost-sensitive evaluation, threshold tuning, and decision policies you can defend, so dashboards don’t quietly ship bad decisions.
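As a minimal sketch of the core claim (not code from the lab itself), the snippet below picks a decision threshold two ways: by maximizing accuracy and by minimizing total misclassification cost. The scores, labels, and the 10:1 false-negative to false-positive cost ratio are all hypothetical, invented purely for illustration.

```python
# Hypothetical model scores and true labels (1 = positive). Invented data.
scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 1, 0, 0, 0, 1, 1, 1, 1]

# Assumed costs: a missed positive is 10x as expensive as a false alarm.
COST_FP = 1.0
COST_FN = 10.0


def confusion(threshold):
    """Return (tp, fp, tn, fn) when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(threshold):
    tp, fp, tn, fn = confusion(threshold)
    return (tp + tn) / len(labels)


def total_cost(threshold):
    tp, fp, tn, fn = confusion(threshold)
    return fp * COST_FP + fn * COST_FN


# Try every observed score as a candidate threshold.
candidates = sorted(set(scores))
best_acc_t = max(candidates, key=accuracy)
best_cost_t = min(candidates, key=total_cost)

for name, t in [("max accuracy", best_acc_t), ("min cost", best_cost_t)]:
    print(f"{name}: threshold={t:.2f} accuracy={accuracy(t):.1f} cost={total_cost(t):.1f}")
```

On this toy data the accuracy-maximizing threshold is 0.70 (accuracy 0.9, cost 10.0), while the cost-minimizing threshold is 0.30 (accuracy 0.7, cost 3.0). Chasing the higher accuracy number would more than triple the assumed cost, which is the kind of gap a metric audit is meant to surface.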
Updated Apr 26, 2026