Deception as the optimal: mesa-optimizers and inner alignment — AI Alignment Forum