ML Alignment Theory Scholars Program Winter 2021

Dec 08, 2021 by evhub

Over the past six weeks, the Stanford Existential Risks Initiative (SERI) has been running a work trial for the “ML Alignment Theory Scholars” (MATS) program. Our goal is to increase the number of people working on alignment theory, and to do this, we’re running a scholars program that provides mentorship, funding, and community to promising new alignment theorists. The program is run in partnership with Evan Hubinger, who has been mentoring each of the scholars throughout the work trial.

As the final phase of the work trial, each participant has taken a previous research artifact (usually an Alignment Forum post) and written a distillation and expansion of it. The posts were picked by Evan, and each participant signed up for one they were interested in. Over the next two weeks (12/7 - 12/17), we’ll be posting all of these pieces to LessWrong and the Alignment Forum as part of a sequence, with a couple of posts going up each day. (There will be around 10-15 posts total.)

Posts in the sequence so far:

ML Alignment Theory Program under Evan Hubinger, by ozhang, evhub, and Victor W
Theoretical Neuroscience For Alignment Theory, by Cameron Berg
Introduction to inaccessible information, by Ryan Kidd
Understanding Gradient Hacking, by peterbarnett
Understanding and controlling auto-induced distributional shift, by L Rudolf L
The Natural Abstraction Hypothesis: Implications and Evidence, by CallumMcDougall
Should we rely on the speed prior for safety?, by Marc Carauleanu
Motivations, Natural Selection, and Curriculum Engineering, by Oliver Sourbut
Universality and the “Filter”, by maggiehayes
Evidence Sets: Towards Inductive-Biases based Analysis of Prosaic AGI, by bayesian_kitten
Disentangling Perspectives On Strategy-Stealing in AI Safety, by shawnghu
Don't Influence the Influencers!, by lhc