Recently, I had a conversation with someone from a math background, asking how they could get into AI safety research. Based on my own path from mathematics to AI alignment, I recommended the following sources. It may prove useful to others contemplating a similar change in career:

  • Superintelligence by Nick Bostrom. It condenses all the main arguments for the power and the risk of AI, and gives a framework in which to think of the challenges and possibilities.
  • Sutton and Barto's Book: Reinforcement Learning: An Introduction. This gives the very basics of what ML researchers actually do all day, and is important for understanding more advanced concepts. It gives (most of) the vocabulary to understand what ML and AI papers are talking about.
  • Gödel without too many tears. This is how I managed to really grok logic and the completeness/incompleteness theorems. Important for understanding many of MIRI's and LessWrong's approaches to AI and decision theory.
  • Safely Interruptible agents. It feels bad to recommend one of my own papers, but I think this is an excellent example of bouncing between ML concepts and alignment concepts to make some traditional systems interruptible (so that we can shut them down without them resisting the shutdown).
  • Alignment for Advanced Machine Learning Systems. Helps give an overall perspective on different alignment methods, and some understanding of MIRI's view on the subject (for a deeper understanding, I recommend diving into MIRI's or Eliezer's publications/writings).

You mileage may vary, but these are the sources that I would recommend. And I encourage you to post any sources you'd recommend, in the comments.

2 comments, sorted by Click to highlight new comments since: Today at 3:48 PM
New Comment

I guess I'd recommend the AGI safety fundamentals course:

On Stuart's list: I think this list might be suitable for some types of conceptual alignment research. But you'd certainly want to read more ML for other types of alignment research.