Decision Theory

More generally, the problem is that for formal agents, false antecedents cause nonsensical reasoning.

No, it's *contradictory* assumptions that cause the trouble. False-but-consistent assumptions are dual to true-and-consistent assumptions, so from either you can only infer a mutually consistent set of propositions.

To put it another way, a formal system has no way of knowing what would be true or false for reasons outside itself, so it has no way of reacting to a merely false statement. But a contradiction *is* definable within a formal system.

To put it yet another way: contradiction in, contradiction out.
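The "contradiction in, contradiction out" point is just the principle of explosion, and it can be checked mechanically. A minimal brute-force sketch (the `entails` helper is hypothetical, written for this illustration): contradictory premises have no models at all, so they vacuously entail everything, whereas merely false but consistent premises still constrain what follows.

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """premises/conclusion are functions from a truth assignment (dict) to bool.
    Entailment holds iff every model of all the premises satisfies the conclusion."""
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False  # found a counter-model
    return True

# Contradictory premises {P, not-P} have no models, so they entail anything.
premises = [lambda e: e["P"], lambda e: not e["P"]]
print(entails(premises, lambda e: e["Q"], ["P", "Q"]))        # True
print(entails(premises, lambda e: not e["Q"], ["P", "Q"]))    # True (and also its negation!)

# A merely false (but consistent) premise does not explode: suppose P is in
# fact true "outside" the system, but we assume not-P. Q still doesn't follow.
false_premise = [lambda e: not e["P"]]
print(entails(false_premise, lambda e: e["Q"], ["P", "Q"]))   # False
```

Note that the formal system never "sees" that `not-P` is false in the world; it only reacts to the internal inconsistency, which is exactly the point above.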

21y

Yep, agreed. I used the language "false antecedents" mainly because I was
copying the language in the comment I replied to, but I really had in mind
"demonstrably false antecedents".

Repeated (and improved) Sleeping Beauty problem

This statement of the problem concedes that SB is calculating subjective probability. It should be obvious that subjective probabilities can diverge from each other and from objective probability -- that is what subjective means. It seems to me that the SB paradox is only a paradox if you try to do justice to objective and subjective probability in the same calculation.

24y

I'm confused, isn't the "objective probability" of heads 1/2 because that is the
probability of heads in the definition of the setup? The halver versus thirder
debate is about subjective probability, not objective probability, as far as I
can tell. I'm not sure why you are mentioning objective probability at all, it
does not appear to be relevant. (Though it is also possible that I do not know
what you mean by "objective probability".)
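Whatever one calls the two quantities, the halfer and thirder numbers come from counting different things, and a short simulation makes that concrete. This is a sketch of the standard protocol (one awakening on heads, two on tails); the function name and counts are mine, not from the thread:

```python
import random

def sleeping_beauty(trials=100_000, seed=0):
    """Simulate the experiment; return (P(heads) per experiment,
    fraction of awakenings that occur in a heads experiment)."""
    rng = random.Random(seed)
    heads_experiments = 0
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(trials):
        heads = rng.random() < 0.5
        heads_experiments += heads
        awakenings = 1 if heads else 2  # tails: woken Monday and Tuesday
        total_awakenings += awakenings
        heads_awakenings += awakenings if heads else 0
    return heads_experiments / trials, heads_awakenings / total_awakenings

per_experiment, per_awakening = sleeping_beauty()
print(per_experiment)   # ≈ 0.5  -- the "halfer" number: chance of heads per run
print(per_awakening)    # ≈ 1/3  -- the "thirder" number: chance of heads per awakening
```

Both numbers are correct answers to different questions; the debate is over which question "What is your credence in heads?" is asking upon awakening.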

An aligned AI will also do what we want because it's also what it wants: its terminal values are also ours.

I've always taken "control" to differ from alignment in that it means an AI doing what we want even if it isn't what it wants, i.e. it has a terminal value of getting rewards, and our values are instrumental to that, if they figure at all.

And I take corrigibility to mean shaping an AI's values as you go along, and therefore an outcome of control.