Do models say what they learn? — AI Alignment Forum