Obstacles to gradient hacking — AI Alignment Forum