The reward engineering problem — AI Alignment Forum