AI ALIGNMENT FORUM

gaspode — AI Alignment Forum

gaspode

53

4y

Gradient Descent on the Human Brain

by Jozdien and gaspode

TL;DR: Many alignment research proposals share a common motif: figure out how to enter a basin of alignment / corrigibility for human-level models, and then amplify to more powerful regimes while generalizing gracefully. In this post we lay out a research agenda that comes at this problem from a...

Apr 1, 2024•61