AMA: Paul Christiano, alignment researcher

I wonder how valuable you find some of the more math/theory-focused research directions in AI safety. That is, how much less impactful do you find them compared to your favorite directions? In particular,

  1. Vanessa Kosoy's learning-theoretic agenda, e.g., the recent sequence on infra-Bayesianism, or her work on traps in RL. Michael Cohen's research, e.g., the paper on imitation learning, seems to go in a similar direction.
  2. The "causal incentives" agenda (link).
  3. Work on agent foundations, such as on Cartesian frames. You have commented on MIRI's research in the past, but maybe you have an updated view.

I'd also be interested in suggestions for other impactful research directions or areas that are more theoretical and less ML-focused (expanding on adamShimi's question, I wonder which parts of mathematics and statistics you expect to be particularly useful).