Reflection Mechanisms as an Alignment target: A follow-up survey
This is the second of three posts (see part I) surveying moral sentiments related to AI alignment. This work was done by Marius Hobbhahn and Eric Landgrebe under the supervision of Beth Barnes as part of AI Safety Camp 2022. TL;DR: We find that the results of our first...
Oct 5, 2022