AI ALIGNMENT FORUM

AGI-assisted Alignment

Jul 09, 2022 by Tor Økland Barstad

Can we start out with an unaligned superintelligent AGI, and end up with an aligned AGI-system? I argue maybe, and discuss principles, techniques and strategies that may enable us to do so.

One reason for exploring such strategies is contingency planning (what if we haven’t solved alignment by the time the first superintelligent AGI arrives?). Another reason is that additional layers of security could be beneficial (even if we think we have solved alignment, are there ways to relatively quickly add additional layers of alignment-assurance?).

This is an ongoing series.

Getting from an unaligned AGI to an aligned AGI?
Making it harder for an AGI to "trick" us, with STVs
Alignment with argument-networks and assessment-predictions