AI ALIGNMENT FORUM

Fabian Schimpf

Website: schimpffabian.github.io


Comments

Open Problems in Negative Side Effect Minimization
Fabian Schimpf, 3y

Starting out more restrictive seems sensible. As you say, the restriction could then be learned away, or one could use human feedback to sign off on high-impact actions. The first option reminds me of estimating regions of attraction (ROA) in nonlinear control, where the ROA is explored without leaving the stable region. The second option hinges on humans being able to understand the implications of high-impact actions, as well as the consequences of a baseline like inaction. There are probably other alternatives that we have not yet considered.
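To make the two options a bit more concrete, here is a minimal sketch of how they could be combined. It assumes a scalar impact score is already available for each action, which is itself an open problem. The names `ImpactGate`, `ask_human`, `threshold`, and `step` are hypothetical and only serve the illustration; this is not a method from the post.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ImpactGate:
    """Hypothetical gate: starts restrictive, relaxes only after human sign-off."""

    threshold: float = 0.1      # assumed initial (restrictive) impact limit
    step: float = 0.05          # assumed maximum relaxation per approval
    max_threshold: float = 1.0  # assumed hard cap on the limit

    def allow(self, impact: float, ask_human: Callable[[float], bool]) -> bool:
        # Low-impact actions pass without consulting a human.
        if impact <= self.threshold:
            return True
        # High-impact actions need explicit sign-off.
        if ask_human(impact):
            # Relax the limit gradually, never beyond the approved impact
            # or the hard cap, so exploration stays close to what was vetted.
            self.threshold = min(
                self.max_threshold, impact, self.threshold + self.step
            )
            return True
        return False


gate = ImpactGate()
always_no = lambda impact: False
always_yes = lambda impact: True

low_ok = gate.allow(0.05, always_no)    # below the limit, no human needed
high_denied = gate.allow(0.5, always_no)   # above the limit, human refuses
high_approved = gate.allow(0.5, always_yes)  # above the limit, human approves
```

The gradual relaxation is loosely analogous to expanding an ROA estimate from the inside: the permitted set only grows by small steps from a region already considered safe. Whether a human can reliably judge the high-impact cases is exactly the concern raised above, and the sketch does not address it.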



 
