AI Alignment Forum

AI Success Models

Contributors: plex
AI Success Models are proposed paths to an existential win via aligned AI. They are (so far) high-level overviews that do not contain all the details, but they present at least a sketch of what a full solution might look like. They can be contrasted with threat models, which are stories about how AI might lead to major problems.

Posts tagged AI Success Models
- Solving the whole AGI control problem, version 0.0001 · Steve Byrnes · 2y · 28 karma · 2 comments
- An overview of 11 proposals for building safe advanced AI · Evan Hubinger · 3y · 68 karma · 25 comments
- A positive case for how we might succeed at prosaic AI alignment · Evan Hubinger · 2y · 42 karma · 25 comments
- Interpretability's Alignment-Solving Potential: Analysis of 7 Scenarios · Evan R. Murphy · 1y · 21 karma
- AI Safety "Success Stories" · Wei Dai · 4y · 45 karma · 11 comments
- An Open Agency Architecture for Safe Transformative AI · davidad (David A. Dalrymple) · 9mo · 29 karma · 18 comments
- Conditioning Generative Models for Alignment · Arun Jose · 1y · 24 karma · 8 comments
- formal alignment: what it is, and some proposals · Tamsin Leake · 8mo · 13 karma
- Towards Hodge-podge Alignment · Cleo Nardo · 9mo · 19 karma · 3 comments
- AI Safety via Luck · Arun Jose · 6mo · 14 karma
- Acceptability Verification: A Research Agenda · David Udell, Evan Hubinger · 1y · 18 karma
- AI Safety Endgame Stories · Ivan Vendrov · 1y · 13 karma
- What success looks like · Marius Hobbhahn, MaxRa, JasperGeh, Yannick_Muehlhaeuser · 1y · 8 karma
- Introduction to the sequence: Interpretability Research for the Most Important Century · Evan R. Murphy · 1y · 4 karma
- Making it harder for an AGI to "trick" us, with STVs · Tor Økland Barstad · 1y · 7 karma
(Showing 15 of 17 posts.)