AI Alignment Writing Day 2018

Aug 13, 2019 by Ben Pace

On 10 July 2018, all attendees of the MIRI Summer Fellows Program were given an entire day to write blog posts for the AI Alignment Forum about ideas they had been thinking about. These are the 28 resulting posts, in chronological order.

Choosing to Choose? by Daniel Herrmann (karma 5, 7y, 1 comment)
The Intentional Agency Experiment by Alexander Gietelink Oldenziel (karma 4, 7y, 0 comments)
Two agents can have the same source code and optimise different utility functions by Joar Skalse (karma 6, 7y, 2 comments)
Conditioning, Counterfactuals, Exploration, and Gears by Diffractor (karma 8, 7y, 1 comment)
Probability is fake, frequency is real by Linda Linsefors (karma 2, 7y, 2 comments)
Repeated (and improved) Sleeping Beauty problem by Linda Linsefors (karma 4, 7y, 3 comments)
Logical Uncertainty and Functional Decision Theory by swordsintoploughshares (karma 5, 7y, 0 comments)
A framework for thinking about wireheading by theotherotheralex (karma 5, 7y, 0 comments)
Bayesian Probability is for things that are Space-like Separated from You by Scott Garrabrant (karma 30, 7y, 1 comment)
A universal score for optimizers by levin (karma 6, 7y, 5 comments)
An environment for studying counterfactuals by Nisan (karma 5, 7y, 6 comments)
Mechanistic Transparency for Machine Learning by DanielFilan (karma 19, 7y, 5 comments)
Bounding Goodhart's Law by eric_langlois (karma 15, 7y, 1 comment)
A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment by AlexMennen (karma 15, 7y, 0 comments)
Dependent Type Theory and Zero-Shot Reasoning by evhub (karma 18, 7y, 1 comment)
Conceptual problems with utility functions by Dacyn (karma 6, 7y, 0 comments)
No, I won't go there, it feels like you're trying to Pascal-mug me by Rupert (karma 5, 7y, 0 comments)
Conditions under which misaligned subagents can (not) arise in classifiers by anon1 (karma 4, 7y, 2 comments)
Complete Class: Consequentialist Foundations by abramdemski (karma 21, 7y, 24 comments)
Clarifying Consequentialists in the Solomonoff Prior by Vlad Mikulik (karma 10, 7y, 13 comments)
On the Role of Counterfactuals in Learning by Max Kanwal (karma 6, 7y, 1 comment)
Agents That Learn From Human Behavior Can't Learn Human Values That Humans Haven't Learned Yet by steven0461 (karma 12, 7y, 5 comments)
Decision-theoretic problems and Theories; An (Incomplete) comparative list by somervta (karma 5, 7y, 0 comments)
Mathematical Mindset by komponisto (karma 12, 7y, 0 comments)
Monk Treehouse: some problems defining simulation by dranorter (karma 4, 7y, 0 comments)
An Agent is a Worldline in Tegmark V by komponisto (karma 8, 7y, 0 comments)
Generalized Kelly betting by Linda Linsefors (karma 8, 7y, 3 comments)
Conceptual problems with utility functions, second attempt at explaining by Dacyn (karma 5, 7y, 0 comments)