Charbel-Raphaël Segerie and Épiphanie Gédéon contributed equally to this post.
Many thanks to Davidad, Gabriel Alfour, Jérémy Andréoletti, Lucie Philippon, Vladimir Ivanov, Alexandre Variengien, Angélina Gentaz, Simon Cosson, Léo Dana and Diego Dorn for useful feedback.
TLDR: We present a new method for a safer-by design AI development. We think using plainly coded AIs may be feasible in the near future and may be safe. We also present a prototype and research ideas on Manifund.
Epistemic status: Armchair reasoning style. We think the method we are proposing is interesting and could yield very positive outcomes (even though it is still speculative), but we are less sure about which safety policy would use it in the... (read 3684 more words →)