An unaligned benchmark — AI Alignment Forum