LHC v0.2: A Benchmark for Long-Horizon Agent Coherence (and the Methodology That Got It Honest)
I just published LHC v0.2, an open benchmark for long-horizon coherence in 8B-class agent models, along with a deterministic parser baseline that sets a useful floor on what fine-tuning is worth for structured-state tasks. This post explains what the benchmark and baseline are for, how to use them, and the methodology arc that produced them across five rounds of external review.