Reward function learning: the value function — AI Alignment Forum