Capital-of Is Not a Single SAE Feature. So I Built a Mutation Loop to Find What Is.
SAE features can't isolate relations in Gemma-2-2B. I built a mutation-selection loop that can. The bottleneck was tokenization.
Why sycophancy SAE features have Cohen's d=9.9 but hallucination detection fails. The answer turned out to be deeper than measurement timing.
Self-evolving AI harnesses fail when they optimize a fixed evaluator. The biological model is right: what needs to evolve is the selection pressure, not just the genome.