Aspiration-based Q-Learning — AI Alignment Forum