Stuart Armstrong

Stuart Armstrong's Comments

Breaking Oracles: superrationality and acausal trade

Hum - my approach here seems to have a similarity to your idea.

A Critique of Functional Decision Theory

I have to say, I find these criticisms a bit weak. Going through them:

III. FDT sometimes makes bizarre recommendations

I'd note that successfully navigating Parfit's hitchhiker also involves violating "Guaranteed Payoffs": you pay the driver at a time when there is no uncertainty, and when you would get better utility from not paying. So I don't think Guaranteed Payoffs is that sound a principle.

Your bomb example is a bit underdefined, since the predictor is predicting your actions AND giving you the prediction. If the predictor is simulating you and asking "would you go left after reading a prediction that you are going right", then you should go left, because, by the probabilities in the setup, you are almost certainly a simulation (this is a kind of "counterfactual Parfit's hitchhiker" situation).

If the predictor doesn't simulate you, and you KNOW they said to go right, you are in a slightly different situation, and you should go right. This is akin to waking up in the middle of the Parfit's hitchhiker experiment, when the driver has already decided to save you, and deciding whether to pay them.

IV. FDT fails to get the answer Y&S want in most instances of the core example that’s supposed to motivate it

This section is incorrect, I think. In this variant, the contents of the boxes are determined not by your decision algorithm, but by your nationality. And of course two-boxing is the right decision in that situation!

"the case for one-boxing in Newcomb’s problem didn’t seem to stem from whether the Predictor was running a simulation of me, or just using some other way to predict what I’d do."

But it does depend on things like this. There's no point in one-boxing unless your one-boxing is connected with the predictor believing that you'd one-box. In a simulation, that's the case; in some other situations where the predictor looks at your algorithm, that's also the case. But if the predictor is predicting based on nationality, then you can freely two-box without changing the predictor's prediction.
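As a minimal sketch of this point, assume the standard Newcomb payoffs (1,000,000 in the opaque box, 1,000 in the transparent one) and two hypothetical predictors: one that reads the agent's policy, and one that looks only at a nationality label. The payoffs, the function names and the labels "A" and "B" are all illustrative assumptions, not part of the original setup.

```python
# Assumed standard Newcomb payoffs; illustrative only.
BIG, SMALL = 1_000_000, 1_000

def payoff(action, predicted):
    """Money received given the agent's action and the predictor's forecast."""
    opaque = BIG if predicted == "one-box" else 0
    return opaque if action == "one-box" else opaque + SMALL

def algorithm_predictor(policy, nationality):
    """Hypothetical predictor that inspects (or simulates) the agent's policy."""
    return policy(nationality)

def nationality_predictor(policy, nationality):
    """Hypothetical predictor that ignores the policy and uses nationality only."""
    return "one-box" if nationality == "A" else "two-box"

def outcome(predictor, action, nationality="A"):
    """Payoff for an agent who always takes `action`, facing `predictor`."""
    policy = lambda _nationality: action
    return payoff(action, predictor(policy, nationality))

if __name__ == "__main__":
    for predictor in (algorithm_predictor, nationality_predictor):
        for nationality in ("A", "B"):
            for action in ("one-box", "two-box"):
                print(predictor.__name__, nationality, action,
                      outcome(predictor, action, nationality))
```

Under the policy-reading predictor, switching to two-boxing changes the prediction and costs the large prize. Under the nationality-based predictor the prediction is fixed, so two-boxing gains the extra 1,000 whichever nationality is assumed.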

V. Implausible discontinuities

There's nothing implausible about discontinuity in the optimal policy, even if the underlying data is continuous. If p is the probability that we're in a smoking lesion vs a Newcomb problem, then as p changes from 0 to 1, the expected utility of one-boxing falls and the expected utility of two-boxing rises. At some point, the optimal action will jump discontinuously from one to the other.
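A small numerical sketch makes the jump concrete. Here p is the probability of being in the smoking-lesion-like world, and the four utilities are made-up numbers chosen only so that one-boxing wins in the Newcomb-like world and two-boxing wins in the lesion-like world. Both expected utilities are continuous (linear) in p, yet the best action switches abruptly where the lines cross, which with these assumed numbers is at p = 0.9.

```python
# Hypothetical utilities for each (world, action) pair; illustration only.
UTILITY = {
    "newcomb": {"one-box": 10.0, "two-box": 1.0},
    "lesion": {"one-box": 5.0, "two-box": 6.0},
}

def expected_utility(action, p):
    """Expected utility of `action` when p = P(smoking-lesion-like world)."""
    return (1 - p) * UTILITY["newcomb"][action] + p * UTILITY["lesion"][action]

def best_action(p):
    """Action with the higher expected utility (ties go to one-box)."""
    return max(("one-box", "two-box"), key=lambda a: expected_utility(a, p))

if __name__ == "__main__":
    for p in (0.0, 0.5, 0.89, 0.91, 1.0):
        print(p,
              expected_utility("one-box", p),
              expected_utility("two-box", p),
              best_action(p))
```

The expected utilities change smoothly across the whole range; only the argmax is discontinuous.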

VI. FDT is deeply indeterminate

I agree FDT is indeterminate, but I don't agree with your example. Your two calculators are clearly isomorphic, just as if we used a different numbering system for one versus the other. Talking about isomorphic algorithms avoids worrying about whether they're the "same" algorithm.

"And in general, it seems to me, there’s no fact of the matter about which algorithm a physical process is implementing in the absence of a particular interpretation of the inputs and outputs of that physical process."

Indeed. But since you and your simulation are isomorphic, you can look at what the consequences are of you outputting "two-box" while your simulation outputs "deux boites" (or "one-box" and "une boite"). And {one-box, une boite} is better than {two-box, deux boites}.

But why did I use those particular interpretations of my and my simulation's physical processes? Because those interpretations are the ones relevant to the problem at hand. My simulation and I will have different weights, consume different amounts of power, run at different times, and probably run at different speeds. If those differences were relevant to the Newcomb problem, then the fact that we are different would become relevant. But since they aren't, we can focus on the core of the matter. (You can also consider the example of playing the prisoner's dilemma against an almost-but-not-quite-identical copy of yourself.)
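The {one-box, une boite} versus {two-box, deux boites} comparison can be sketched as follows, assuming the standard Newcomb payoffs and a simulation that runs the same decision procedure but reports its output under a different labelling. The payoffs, the translation table and the function names are hypothetical choices for illustration.

```python
# Assumed standard Newcomb payoffs; illustrative only.
BIG, SMALL = 1_000_000, 1_000

# The simulation reports in a different "numbering system" for the same outputs.
TO_SIMULATION_LABEL = {"one-box": "une boite", "two-box": "deux boites"}
FROM_SIMULATION_LABEL = {v: k for k, v in TO_SIMULATION_LABEL.items()}

def run(decision_procedure):
    """Run the agent and its relabelled simulation; return both outputs and the payoff."""
    agent_output = decision_procedure()
    simulation_output = TO_SIMULATION_LABEL[decision_procedure()]
    # The predictor fills the opaque box according to the simulation's output,
    # read under the interpretation relevant to the problem.
    predicted = FROM_SIMULATION_LABEL[simulation_output]
    opaque = BIG if predicted == "one-box" else 0
    money = opaque if agent_output == "one-box" else opaque + SMALL
    return agent_output, simulation_output, money

if __name__ == "__main__":
    print(run(lambda: "one-box"))
    print(run(lambda: "two-box"))
```

Because the two processes are isomorphic, only the matched pairs of outputs can occur, and the pair containing "one-box" pays more.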

2018 AI Alignment Literature Review and Charity Comparison

Very thorough, and it's very worthwhile that posts like this are made.

Bottle Caps Aren't Optimisers

It's helped me hone my thinking on what is and isn't an optimiser (and a wireheader, and so on, for associated concepts).
