Reward/value learning for reinforcement learning — AI Alignment Forum