Confidence Is Not Oversight Triage
July 2, 2026. A short note on why average calibration does not tell us which individual agent actions deserve scarce human review.
Pablo Zavala · AI Safety Evaluation · Research Engineering
Notes on AI governance, evaluation, economics, and institutional design by Pablo Zavala.