a sketch of how we might go about getting basins of corrigibility from RL — AI Alignment Forum