Signal Over Noise

Evals Are the Tests You're Not Writing

Justin Wilson
May 11, 2026

The dominant QA strategy for production AI in 2026 is “I ran the demo three times and it worked.” Nobody says this out loud. It’s the default state of every system that didn’t get evals built in at the start, which is most of them, and the gap is widening every month.

This would be career-ending in any other domain. You wouldn’t ship a payment API withou…

© 2026 Justin Wilson