Kolmogorov complexity makes reward learning worse — AI Alignment Forum