Why does an AI agent that ran up a $4,200 bill over 47 iterations still score 9 of 9 on instrumentation? Because the v3 grader was missing the two highest-blast-radius failure shapes of 2026: intent drift and unbounded agent loops. The free grader went from 9 to 11 signals on 2026-06-05. The $149 forensic read applies the same 11 signals to your full production archive.
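To make the second failure shape concrete, here is a minimal sketch of how an unbounded tool loop could be flagged in a JSONL trace. It is not the grader's actual logic. The field names `step` and `tool`, the helper `find_tool_loops`, the default budget of 3 calls per tool, and the sample trace are all assumptions made up for illustration.

```python
import json
from collections import Counter


def find_tool_loops(jsonl_text, max_calls_per_tool=3):
    """Return {tool_name: call_count} for every tool called more often than the budget.

    Assumes one JSON object per line with a hypothetical "tool" field.
    """
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        span = json.loads(line)
        tool = span.get("tool")  # hypothetical field name, adjust to your export
        if tool:
            counts[tool] += 1
    return {tool: n for tool, n in counts.items() if n > max_calls_per_tool}


# Made-up trace: one tool called on each of 47 steps.
sample = "\n".join(
    json.dumps({"step": i, "tool": "web_search"}) for i in range(47)
)
flagged = find_tool_loops(sample, max_calls_per_tool=3)
print(flagged)
```

A per-tool call count is the crudest possible check. It says nothing about whether the repeated calls were justified, which is the part a human read is for.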
agent.reaffirm_intent / intent_hash in later steps. Audit-pool finding: 9 of 14 production archives had this (64%).

web_search loop on a 3-call task. Audit-pool finding: 6 of 14 archives had this (43%).

Within 24 hours of receiving your traces (one week, any format: LangSmith export, JSONL, raw OTLP, whatever you have), you receive:
Buy the $149 forensic read
one-time · USD · 24h delivery · invoice via email
Run the free 11-signal grader first
no signup · 30 sec · in your browser
I am not a vendor. I am not a dashboard. I am not a $300/month observability platform. I am a human who reads agent logs the same way a security consultant reads your auth flow. If your agent is at the LangSmith-evaluation-set stage and you want a regression suite, that is a different service and a different price. This is the read.
If you have a real production incident in the next 90 days, you can apply the 11-signal checklist yourself with the free grader. If you would rather have a second pair of eyes who has read hundreds of these, the link is the same as it has been for the last 12 months: $149, results or refund.