AI ALIGNMENT FORUM

SMK

Comments

FixDT
SMK · 2y

You might also find the following cases interesting (with self-locating uncertainty as an additional dimension), from this post.

Sleeping Newcomb-1. Some researchers, led by the infamous superintelligence Omega, are going to put you to sleep. During the two days that your sleep will last, they will briefly wake you up either once or twice, depending on the toss of a biased coin (Heads: once; Tails: twice). After each waking, they will put you back to sleep with a drug that makes you forget that waking. The weight of the coin is determined by what the superintelligence predicts that you would say when you are awakened and asked to what degree you ought to believe that the outcome of the coin toss is Heads. Specifically, if the superintelligence predicted that you would have a degree of belief p in Heads, then they will have weighted the coin such that the 'objective chance' of Heads is p. So, when you are awakened, to what degree ought you believe that the outcome of the coin toss is Heads?

Sleeping Newcomb-2. Some researchers, led by the superintelligence Omega, are going to put you to sleep. During the two days that your sleep will last, they will briefly wake you up either once or twice, depending on the toss of a biased coin (Heads: once; Tails: twice). After each waking, they will put you back to sleep with a drug that makes you forget that waking. The weight of the coin is determined by what the superintelligence predicts your response would be when you are awakened and asked to what degree you ought to believe that the outcome of the coin toss is Heads. Specifically, if Omega predicted that you would have a degree of belief p in Heads, then they will have weighted the coin such that the 'objective chance' of Heads is 1−p. Then: when you are in fact awakened, to what degree ought you believe that the outcome of the coin toss is Heads?
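For what it's worth, the self-consistent credences in these two cases can be computed directly. Below is a minimal sketch assuming a simple 'thirder'-style update on awakening (weight each coin outcome by its number of awakenings); that updating rule, the halfer contrast at the end, and the function name are my own illustration, not part of the original cases.

```python
from math import isclose, sqrt

def thirder_credence(chance_heads):
    """Credence in Heads on awakening, weighting each outcome by its
    number of awakenings (one awakening if Heads, two if Tails)."""
    return chance_heads / (chance_heads + 2 * (1 - chance_heads))

# Sleeping Newcomb-1: Omega sets the chance of Heads to the predicted credence p.
# Self-consistency requires thirder_credence(p) == p, i.e. p/(2-p) = p,
# which holds only at p = 0 and p = 1.
for p in (0.0, 1.0):
    assert isclose(thirder_credence(p), p)

# Sleeping Newcomb-2: Omega sets the chance of Heads to 1 - p.
# Self-consistency requires (1-p)/(1+p) = p, i.e. p^2 + 2p - 1 = 0,
# whose root in [0, 1] is p = sqrt(2) - 1 ≈ 0.414.
p = sqrt(2) - 1
assert isclose(thirder_credence(1 - p), p)

# By contrast, a "halfer" who simply adopts the objective chance finds that
# every p in [0, 1] is self-consistent in Newcomb-1, and only p = 1/2 in Newcomb-2.
```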

FixDT
SMK · 2y

Epistemic Constraint: The probability distribution p which the agent settles on cannot be self-refuting according to the beliefs. It must be a fixed point of b: a p such that b(p)=p.

Minor: there might be cases in which there is a fixed point p, but where the agent doesn't literally converge or deliberate their way to it, right? (Because you are only looking for b to satisfy the conditions of Brouwer/Kakutani, and not, say, Banach, right?) In other words, it might not always be accurate to say that the agent "settles on p". EDIT: oh, maybe you are just using "settles on" in the colloquial way.
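To make the worry concrete, here is a toy illustration (my own example, not from the post): a continuous belief map b has a fixed point by Brouwer, but deliberation that just iterates b never settles on it, whereas a Banach-style contraction does converge.

```python
# A toy belief map (my own example): b is continuous on [0, 1], so Brouwer
# guarantees a fixed point (here p = 0.5), but b is not a contraction, and
# deliberation that repeatedly replaces p with b(p) never settles on it.

def b(p):
    return 1.0 - p

p = 0.3
for step in range(6):
    print(step, p)  # oscillates 0.3, 0.7, 0.3, 0.7, ... and never reaches 0.5
    p = b(p)

# With a Banach-style contraction the story is different: iteration converges.
def b_contracting(p):
    return 0.5 * p + 0.25  # |slope| = 0.5 < 1, unique fixed point at p = 0.5

p = 0.3
for step in range(20):
    p = b_contracting(p)
print(p)  # ~0.5
```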

FixDT
SMK · 2y

A common trope is for magic to work only when you believe in it. For example, in Harry Potter, you can only get to the magical train platform 9 3/4 if you believe that you can pass through the wall to get there.

Are you familiar with Greaves' (2013) epistemic decision theory? These types of cases are precisely the ones she considers, although she is entirely focused on the epistemic side of things. For example (p. 916):

Leap. Bob stands on the brink of a chasm, summoning up the courage to try and leap across it. Confidence helps him in such situations: specifically, for any value of x between 0 and 1, if Bob attempted to leap across the chasm while having degree of belief x that he would succeed, his chance of success would then be x. What credence in success is it epistemically rational for Bob to have?

And even more interesting cases (p. 917):

Embezzlement. One of Charlie’s colleagues is accused of embezzling funds. Charlie happens to have conclusive evidence that her colleague is guilty. She is to be interviewed by the disciplinary tribunal. But Charlie’s colleague has had an opportunity to randomize the content of several otherwise informative files (files, let us say, that the tribunal will want to examine if Charlie gives a damning testimony). Further, in so far as the colleague thinks that Charlie believes him guilty, he will have done so. Specifically, if x is the colleague’s prediction for Charlie’s degree of belief that he’s guilty, then there is a chance x that he has set in motion a process by which each proposition originally in the files is replaced by its own negation if a fair coin lands Heads, and is left unaltered if the coin lands Tails. The colleague is a very reliable predictor of Charlie’s doxastic states. After such randomization (if any occurred), Charlie has now read the files; they (now) purport to testify to the truth of n propositions P1,…,Pn. Charlie’s credence in each of the propositions Pi, conditional on the proposition that the files have been randomized, is 1/2; her credence in each Pi conditional on the proposition that the files have not been randomized is 1. What credence is it epistemically rational for Charlie to have in the proposition G that her colleague is guilty and in the propositions Pi that the files purport to testify to the truth of?

In particular, Greaves' (2013, §8, pp. 43-49) epistemic version of Arntzenius' (2008) deliberational (causal) decision theory might be seen as a way of making sense of the first part of your theory. The idea, inspired by Skyrms (1990), is that deciding on a credence involves a cycle of calculating epistemic expected utility (measured by a proper scoring rule), adjusting credences, and recalculating utilities until an equilibrium is obtained. For example, in Leap above, epistemic D(C)DT would find any credence permissible. And I guess that the second part of your theory serves as a way of breaking ties.
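To illustrate, here is a minimal sketch of that deliberational cycle for Leap, using the Brier score as the epistemic utility; the code and function names are my own reconstruction, not anything from Greaves or Arntzenius.

```python
# Skyrms-style deliberational picture for Leap with a Brier score
# (my own illustration of the "any credence is permissible" claim).

def expected_brier_inaccuracy(candidate, chance):
    """Expected Brier inaccuracy of holding credence `candidate`
    when the objective chance of success is `chance`."""
    return chance * (1 - candidate) ** 2 + (1 - chance) * candidate ** 2

def deliberate(initial_credence, steps=50, grid=1000):
    """Repeatedly move to the credence that minimizes expected inaccuracy,
    where (as in Leap) the chance of success equals the current credence."""
    candidates = [i / grid for i in range(grid + 1)]
    x = initial_credence
    for _ in range(steps):
        chance = x  # adopting credence x would make the chance of success x
        x = min(candidates, key=lambda y: expected_brier_inaccuracy(y, chance))
    return x

# Expected inaccuracy x*(1-y)^2 + (1-x)*y^2 is minimized at y = x, so
# deliberation never moves: every starting credence is an equilibrium.
for start in (0.0, 0.3, 0.5, 0.9):
    print(start, "->", deliberate(start))
```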

Invulnerable Incomplete Preferences: A Formal Statement
SMK · 2y

And, second, the agent will continually implement that plan, even if this makes it locally choose counter-preferentially at some future node.

Nitpick: IIRC, McClennen never talks about counter-preferential choice. Rather, that's Gauthier's (1997) approach to resoluteness.

as devised by Bryan Skyrms and Gerald Rothfus (cf Rothfus 2020b).

Found a typo: it is supposed to be Gerard. (It is also misspelt in the reference list.)

Posts

23 · Open-minded updatelessness · 2y · 2

Wikitag Contributions

Updateless Decision Theory · 3 years ago · (+64)
Causal Decision Theory · 3 years ago · (+4/-3)
Causal Decision Theory · 3 years ago · (+55/-90)
Functional Decision Theory · 3 years ago · (+49)
Evidential Decision Theory · 3 years ago · (+6/-4)
Evidential Decision Theory · 3 years ago · (+52)
Evidential Decision Theory · 3 years ago · (+56)
Evidential Decision Theory · 3 years ago · (+17/-8)
Evidential Decision Theory · 3 years ago · (+23)
Evidential Decision Theory · 3 years ago · (+84)