Benchmark for successful concept extrapolation/avoiding goal misgeneralization — AI Alignment Forum