Series

Reading the Residual Stream

2 posts

Part 1 · 7 Apr 2026 · 6 min read

Sparse Autoencoders Can't Measure Generation-Time Behavior. That's Not a Bug.

Why sycophancy SAE features have Cohen's d = 9.9 but hallucination detection fails. The answer turned out to be deeper than measurement timing.

Part 2 · 11 Apr 2026 · 9 min read

SAE features can't isolate relations in Gemma-2-2B. I built a mutation-selection loop that can. The bottleneck was tokenization.