Exploration Hacking: Can LLMs Learn to Resist RL Training? — AI Alignment Forum