AI ALIGNMENT FORUM
Aligned AI Proposals
Posts tagged Aligned AI Proposals
3
18
A "Bitter Lesson" Approach to Aligning AGI and ASI
Roger Dearnaley
3mo
0
2
19
How to Control an LLM's Behavior (why my P(DOOM) went down)
Roger Dearnaley
10mo
0
2
8
Goodbye, Shoggoth: The Stage, its Animatronics, & the Puppeteer – a New Metaphor
Roger Dearnaley
9mo
0
3
15
Requirements for a Basin of Attraction to Alignment
Roger Dearnaley
8mo
0
2
13
Interpreting the Learning of Deceit
Roger Dearnaley
10mo
2
1
56
A list of core AI safety problems and how I hope to solve them
davidad (David A. Dalrymple)
1y
12
1
46
AI Alignment Metastrategy
Vanessa Kosoy
9mo
1
1
5
an Evangelion dialogue explaining the QACI alignment plan
Tamsin Leake
1y
2
2
18
Safety First: safety before full alignment. The deontic sufficiency hypothesis.
Chipmonk
9mo
1
1
14
We have promising alignment plans with low taxes
Seth Herd
11mo
0
1
17
The (partial) fallacy of dumb superintelligence
Seth Herd
1y
0
1
33
An Open Agency Architecture for Safe Transformative AI
davidad (David A. Dalrymple)
2y
18
1
43
Scalable Oversight and Weak-to-Strong Generalization: Compatible approaches to the same problem
Ansh Radhakrishnan, Buck Shlegeris, Ryan Greenblatt, Fabien Roger
10mo
2
1
15
How to safely use an optimizer
Simon Fischer
6mo
0
1
5
An LLM-based “exemplary actor”
Roman Leventov
1y
0