AI ALIGNMENT FORUM
Academic Papers
Contributors
2
Kaj Sotala
Posts either linking to, or summarizing, formal papers published elsewhere.
Posts tagged
Academic Papers
1
61
Some AI research areas and their relevance to existential safety
Andrew Critch
1y
37
2
48
2021 AI Alignment Literature Review and Charity Comparison
Larks
4mo
13
1
17
Formal Solution to the Inner Alignment Problem
michaelcohen
1y
123
1
20
Why is pseudo-alignment "worse" than other ways ML can fail to generalize?
nostalgebraist, Evan Hubinger
2y
8
1
25
How truthful is GPT-3? A benchmark for language models
Owain Evans
8mo
18
0
18
Learning preferences by looking at the world
Rohin Shah
3y
4
0
26
Human-AI Collaboration
Rohin Shah
3y
4
0
19
Learning biases and rewards simultaneously
Rohin Shah
3y
3
0
11
New paper: Corrigibility with Utility Preservation
Koen Holtman
3y
0
0
7
Implications of Quantum Computing for Artificial Intelligence Alignment Research
Jaime Sevilla, Pablo Antonio Moreno Casares
3y
3
0
10
New paper: The Incentives that Shape Behaviour
Ryan Carey
2y
3
1
5
Demanding and Designing Aligned Cognitive Architectures
Koen Holtman
5mo
5