CAIS Philosophy Fellowship Midpoint Deliverables

Jun 07, 2023 by Dan H

Conceptual AI safety researchers aim to help orient the broader field of AI safety, but in doing so they must wrestle with imprecise, nebulous, hard-to-define problems. Philosophers specialize in dealing with problems like these. The CAIS Philosophy Fellowship supports PhD students, postdocs, and professors of philosophy in producing novel conceptual AI safety research.

This sequence collects drafts written by the CAIS Philosophy Fellows and is meant to elicit feedback.

Instrumental Convergence? [Draft] by J. Dmitri Gallow

The Polarity Problem [Draft] by Dan H, cdkg, Simon Goldstein

Shutdown-Seeking AI by Simon Goldstein

Is Deontological AI Safe? [Feedback Draft] by Dan H, William D'Alessandro

There are no coherence theorems by Dan H, EJT

Aggregating Utilities for Corrigible AI [Feedback Draft] by Dan H, Simon Goldstein

AI Will Not Want to Self-Improve by petersalib

Group Prioritarianism: Why AI Should Not Replace Humanity [draft] by fsh

Language Agents Reduce the Risk of Existential Catastrophe by cdkg, Simon Goldstein