AI ALIGNMENT FORUM

AI Success Models

Edited by plex, last updated 17th Nov 2021

AI Success Models are proposed paths to an existential win via aligned AI. They are (so far) high-level overviews and won't contain all the details, but they present at least a sketch of what a full solution might look like. They can be contrasted with threat models, which are stories about how AI might lead to major problems.
Posts tagged AI Success Models
28 karma · Solving the whole AGI control problem, version 0.0001 · Steve Byrnes · 4y · 2 comments
72 karma · An overview of 11 proposals for building safe advanced AI · Evan Hubinger · 5y · 32 comments
43 karma · A positive case for how we might succeed at prosaic AI alignment · Evan Hubinger · 4y · 33 comments
23 karma · Interpretability’s Alignment-Solving Potential: Analysis of 7 Scenarios · Evan R. Murphy · 3y · 0 comments
50 karma · AI Safety "Success Stories" · Wei Dai · 6y · 11 comments
46 karma · Four visions of Transformative AI success · Steve Byrnes · 1y · 11 comments
34 karma · An Open Agency Architecture for Safe Transformative AI · davidad (David A. Dalrymple) · 3y · 18 comments
28 karma · Conditioning Generative Models for Alignment · Arun Jose · 3y · 8 comments
26 karma · Gradient Descent on the Human Brain · Arun Jose, gaspode · 1y · 0 comments
11 karma · How Would an Utopia-Maximizer Look Like? · Thane Ruthenis · 2y · 9 comments
23 karma · Towards Hodge-podge Alignment · Cleo Nardo · 3y · 3 comments
18 karma · AI Safety via Luck · Arun Jose · 2y · 0 comments
18 karma · Acceptability Verification: A Research Agenda · David Udell, Evan Hubinger · 3y · 0 comments
13 karma · AI Safety Endgame Stories · Ivan Vendrov · 3y · 0 comments
8 karma · What success looks like · Marius Hobbhahn, MaxRa, JasperGeh, Yannick_Muehlhaeuser · 3y · 0 comments
(Showing 15 of 19 tagged posts.)