What's the theory of impact for activation vectors? — AI Alignment Forum