Training fails to elicit subtle reasoning in current language models — AI Alignment Forum