I've been clarifying my own understanding of the alignment problem over the past few months, and wanted to share my first writeups with folks here in case they're useful: https://www.danieldewey.net/risk/ The site currently has 3 pages: 1. The case for risk: how deep learning could become very influential, training problems...
One problem that sometimes comes up in theoretical AI is finding ways for AI systems to model themselves, or at least to act well as if they had models of themselves. I can see how this is a problem for uncomputable agents like AIXI (though I think...
I'm not sure this is on-topic for this forum -- if it's too far from the forum's purpose, let me know and I'll take it down! I've recently published an introduction to research on superintelligence risk, with the aim of making it easier for students to get into this area....