At 5 cents per neuron with 4o-mini, the descriptions are pretty satisfying.

"we fine-tune Llama-3.1-8B-Instruct to directly predict per-token activations ... [this] allows us to use smaller models, and the task of directly predicting the output (integer from 0-10) gets rid of the extra tokens, making the prompt much shorter."
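
For anyone curious what that setup could look like in practice, here's a minimal sketch of a single fine-tuning record for the "directly predict an integer 0-10 per token" task. The field names, prompt wording, and quantization are my own guesses for illustration, not the paper's actual format:

```python
# Sketch (my reconstruction, not the paper's code): the prompt carries the
# neuron explanation plus the token sequence, and the target is only the
# per-token activations quantized to integers 0-10, so no extra tokens.

import json

def quantize(activations, max_activation):
    """Map raw activations onto the 0-10 integer scale (assumed scheme)."""
    return [round(10 * max(a, 0.0) / max_activation) for a in activations]

def make_example(explanation, tokens, activations, max_activation):
    prompt = (
        "Neuron explanation: " + explanation + "\n"
        "Tokens: " + " ".join(tokens) + "\n"
        "Predict the activation (0-10) for each token:"
    )
    target = " ".join(str(a) for a in quantize(activations, max_activation))
    # Chat-style record usable with most SFT trainers for Llama-3.1-8B-Instruct.
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": target},
        ]
    }

if __name__ == "__main__":
    ex = make_example(
        explanation="fires on words related to cold weather",
        tokens=["The", "snow", "fell", "on", "the", "beach"],
        activations=[0.1, 8.2, 1.3, 0.0, 0.2, 0.5],
        max_activation=9.0,
    )
    print(json.dumps(ex, indent=2))
```

The point of the integer-only target is exactly what the quote says: the assistant output is just a short string of digits, so both training and inference prompts stay much shorter than a full simulation-style transcript.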