Malignancy in the prior seems like a strong crux of the goal-design part of alignment to me. Whether your prior is going to be used to model: * processes in the multiverse containing the AI which does said modeling, * processes which would output all of some blog so we...
(Epistemic status: I think this is right?) Alice is the CEO of ArmaggAI, and Bob is the CEO of BigModelsAI, two major AI capabilities organizations. They're racing to be the first to build a superintelligence aligned to their respective CEV which would take over the universe and satisfy their values....
this work was done by Tamsin Leake and Julia Persson at Orthogonal. thanks to mesaoptimizer for his help putting together this post. what does the QACI plan for formal-goal alignment actually look like when formalized as math? in this post, we'll be presenting our current formalization, which we believe has...
We recently announced Orthogonal, an agent foundations alignment research organization. In this post, I give a thorough explanation of the formal-goal alignment framework, the motivation behind it, and the theory of change it fits in. The overall shape of what we're doing is: * Building a formal goal which would...