This is a draft written by J. Dmitri Gallow, Senior Research Fellow at the Dianoia Institute of Philosophy at ACU, as part of the Center for AI Safety Philosophy Fellowship. This draft is meant to solicit feedback.
Abstract
The thesis of instrumental convergence holds that a wide range of ends have common means: for instance, self-preservation, desire preservation, self-improvement, and resource acquisition. Bostrom (2014) contends that instrumental convergence gives us reason to think that "the default outcome of the creation of machine superintelligence is existential catastrophe". I use the tools of decision theory to investigate whether this thesis is true. I find that, even...
Thanks for the read and for the response.
>None of your models even include actions that are analogous to the convergent actions on that list.
I'm not entirely sure what you mean by "model", but from your use of the term in the penultimate paragraph, I take it you're talking about a particular decision scenario Sia could find herself in. If so, then my goal wasn't to prove anything about any particular model, but rather to prove things about every model.
>The non-sequential theoretical model is irrelevant to instrumental convergence, because instrumental convergence is about putting yourself in a better position to pursue your goals later on.
Sure. I started with the easy cases to get the main ideas out...