You get paged at 2 a.m.: an LLM-backed agent just approved a refund that violates policy. The postmortem shows the model skipped a constraint mid-generation and returned a confident but incorrect result. Your goal is narrow and technical: make the model allocate tokens to structured reasoning so multi-step checks stop collapsing into plausible nonsense. You …
