Activation additions in a small residual network — AI Alignment Forum