# The Unexpected Clanging


There are two boxes in front of you. In one of them, there is a little monkey with a cymbal, whilst the other box is empty. In precisely one hour the monkey will clang its cymbal.

While you wait, you produce an estimate of the probability of the monkey being in the first box. Let's assume that you form your last estimate, p, three seconds before the monkey clangs its cymbal. You can see the countdown and you know that it's your final estimate, partly because you're slow at arithmetic.

Let Omega be an AI that can perfectly simulate your entire deliberation process. Before you entered the room, Omega predicted what your last probability estimate would be and decided where to place the monkey so as to mess with you. Let q be the probability of Omega placing the monkey in the first box. In particular, Omega sets q=p/2, unless p=0 or you haven't formed a probability estimate, in which case q=1.

What probability should you assign to the monkey being in the first box?

I think it's fairly clear that this is a no-win situation. No matter what final probability estimate you form before the clanging, as soon as you've locked it in, you know that it is incorrect, even if you haven't heard the clanging yet. You can try to escape this, but there's no reason that the universe has to play nice.
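The no-win claim can be checked mechanically. The sketch below encodes Omega's rule as a function `omega_q` (a name chosen here for illustration) and uses exact rational arithmetic to look for an estimate p that equals the resulting q. The grid of candidate estimates is an assumption; the underlying algebra is that p = p/2 forces p = 0, and p = 0 gives q = 1.

```python
from fractions import Fraction
from typing import Optional


def omega_q(p: Optional[Fraction]) -> Fraction:
    """Omega's rule from the post: q = p/2, unless p is 0 or missing, then q = 1."""
    if p is None or p == 0:
        return Fraction(1)
    return p / 2


def calibrated_estimates(steps: int = 1000) -> list:
    """Return every grid estimate p in [0, 1] for which q equals p."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    return [p for p in grid if omega_q(p) == p]


print(calibrated_estimates())  # prints []
```

The empty list reflects that every estimate you could lock in is wrong by construction, whichever grid you pick.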

This problem can be seen as a variation on Death in Damascus. I designed this problem to reveal that the core challenge Death in Damascus poses isn't just that another process in the world can depend upon your decision, but that it can depend upon your expectations even if you don't actually make a decision based upon those expectations.

I also find this problem a useful intuition pump, as I think it's clearer that it's a no-win situation than in other similar problems. In Newcomb's problem, it's easy to get caught up thinking about the Principle of Dominance. In Death in Damascus, you can confuse yourself trying to figure out whether CDT recommends staying or fleeing. At least to me, in this problem it is clearer that it is a dead end and that there's no way to beat Omega.

This is also a useful intuition pump for the Evil Genie Puzzle. When I first discovered this puzzle, I felt immensely confused that no matter which decision you made, you would immediately regret it. However, the complexity of the puzzle made it difficult for me to figure out exactly what to make of it, so when trying to solve it I came up with this problem as something easier to grok. I guess my position after considering the Unexpected Clanging is that you just have to accept that a sufficiently powerful agent may be able to mess with you like this and that you just have to deal with it. (I'll leave a more complete analysis to a future post.)

# Comments

There is an incentive for hidden expectation/cognition that Omega isn't diagonalizing (things like creating new separate agents in the environment). Also, at least you can know how ground truth depends on your official "expectation" of ground truth. The truth of your knowledge of this dependence wasn't diagonalized away, so there is an opportunity for control.

Interesting. This prank seems to be one you could play on a Logical Inductor; I wonder what the outcome would be. One fact that's possibly related is that computable functions are continuous. This would imply that whatever computable function Omega applies to your probability estimate, there exists a fixed-point probability you can choose where you'll be correct about the monkey probability. Of course, if you're a bounded agent thinking for a finite amount of time, you might as well be outputting rational probability estimates, in which case discontinuous functions like Omega's rule in the post become computable for Omega.
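The fixed-point claim for continuous responses can be illustrated numerically. The sketch below is not from the comment: it assumes Omega's response f maps [0, 1] into [0, 1] continuously, and it bisects on f(p) - p. The helper `find_fixed_point` and the example rule are hypothetical.

```python
from typing import Callable


def find_fixed_point(f: Callable[[float], float], tol: float = 1e-12) -> float:
    """Bisect on g(p) = f(p) - p, assuming f is continuous from [0, 1] to [0, 1]."""
    lo, hi = 0.0, 1.0  # g(lo) >= 0 and g(hi) <= 0 because f stays inside [0, 1]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) - mid >= 0:
            lo = mid  # a fixed point lies at or to the right of mid
        else:
            hi = mid  # a fixed point lies to the left of mid
    return (lo + hi) / 2


# Hypothetical continuous rule: Omega halves your estimate and adds a quarter.
p_star = find_fixed_point(lambda p: p / 2 + 0.25)
print(round(p_star, 6))  # prints 0.5
```

Applied to the post's rule (q = p/2 for p > 0, q = 1 at p = 0), the same search shrinks toward p = 0, where the jump to q = 1 means the limit is not a fixed point. That discontinuity is what lets Omega win.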

Suppose I decide that my opinion on the location of the monkey will be left or right depending on one bit of quantum randomness, which I will sample so close to the deadline that my doing so is outside Omega's backward light cone at the time of the deadline: say, a few tens of nanoseconds before the deadline, if Omega is at least a few tens of feet away from me and the two boxes. By the (currently believed to be correct) laws of quantum mechanics, qubits cannot be cloned, and by locality, useful information cannot propagate faster than light. So unless Omega is capable of breaking very basic principles of (currently hypothesized) physical law, say by having access to faster-than-light travel or a functioning time loop not enclosed by an event horizon, or by having root access to a vast quantum-mechanics simulator that our entire universe is in fact running on, it physically cannot predict this opinion. Obviously we have some remaining Knightian uncertainty as to whether the true laws of physics (as opposed to our current best guess at them) allow either of these things, or whether our universe is in fact a vast quantum simulation. But it's quite possible that the answer to the physics question is in fact 'No', as all current evidence suggests, in which case, no matter how much classical or quantum computational power Omega throws at the problem, there are random processes whose outcome it simply cannot reliably predict.
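The "tens of nanoseconds for tens of feet" figure follows from light covering roughly one foot per nanosecond. A minimal sketch of that arithmetic, with the 30-foot distance chosen here purely as an illustrative assumption:

```python
C_M_PER_S = 299_792_458  # speed of light in vacuum (exact by SI definition)
M_PER_FT = 0.3048        # metres per international foot (exact)


def light_delay_ns(distance_ft: float) -> float:
    """Time in nanoseconds for light to cover distance_ft feet."""
    return distance_ft * M_PER_FT / C_M_PER_S * 1e9


print(round(light_delay_ns(30), 1))  # prints 30.5
```

So a bit sampled less than about 30 ns before the deadline cannot have reached an Omega standing 30 feet away by the time the deadline arrives.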

[Also note that there is some actual observable evidence on the subject of the true laws of physics in this regard: the Fermi paradox, of why no aliens colonized Earth geological ages ago, gets even harder to explain if our universe's physical laws allow those aliens access to FTL and/or time loops.]

Classically, any computation can be simulated given its initial state and enough computational resources. In quantum information theory, that's also true, but a very fundamental law, the no-cloning theorem, implies that the available initial state information has to be classical rather than quantum, which means that the random results of quantum measurements in the real system and any simulation are not correlated. So quantum mechanics means that we do have access to real randomness that no external attacker can predict, regardless of their computational resources. Both quantum mechanical coherence and information not being able to travel faster than light-speed also provide ways for us to keep a secret so that it's physically impossible for it to leak for a short time.

So as long as Omega is causal (rather than being acausal or the sysop of our simulated universe) and we're not badly mistaken about the fundamental nature of physical laws, there are things that it's actually physically impossible for Omega to do, and beating the approach I outlined above is one of them. (As opposed to, say, using nanotech to sabotage my quantum-noise generator, or indeed to sabotage me, which are physically possible.)

So designing ideal decision theories for the correct way to act in a classical universe, in the presence of other agents with large computational resources who are able to predict you perfectly, doesn't seem very useful to me. We live in a quantum universe, initial state information will never be perfect, and agents are highly non-linear systems, so quantum fluctuations blown up to classical scales by non-linear effects will cause a predictive model to fail after a few coherence times followed by a sufficient number of Lyapunov times. It's quite easy to build a system whose coherence and Lyapunov times are deliberately made short, so that it's impossible to predict over quite short periods if it wants to be (for example, continuously feed the output from a quantum random noise generator into perturbing the seed of a high-cryptographic-strength pseudo-random-number generator run on well-shielded hardware, ideally quantum hardware).
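The role of Lyapunov times can be illustrated with a purely classical toy model. The sketch below is a stand-in, not the system described above: it assumes the logistic map at r = 4, a standard chaotic example, and counts how many steps it takes two trajectories that start 1e-12 apart to differ by more than 0.1. The function names and parameter values are illustrative choices.

```python
def logistic(x: float, r: float = 4.0) -> float:
    """One step of the logistic map, which is chaotic at r = 4."""
    return r * x * (1.0 - x)


def steps_until_divergence(x0: float, eps: float = 1e-12,
                           threshold: float = 0.1, max_steps: int = 1000) -> int:
    """Count steps until two trajectories starting eps apart differ by threshold."""
    a, b = x0, x0 + eps
    for step in range(1, max_steps + 1):
        a, b = logistic(a), logistic(b)
        if abs(a - b) > threshold:
            return step
    return max_steps  # no divergence seen within the step budget


print(steps_until_divergence(0.3))
```

Because the gap roughly doubles per step on average, an error of one part in a trillion in the initial state swamps the prediction after a few dozen steps. Improving the initial measurement buys only logarithmically more prediction time.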

Of course, in a non-linear system, it's still possible to predict the climate far further ahead than you can predict the weather: if Omega has sufficiently vast quantum computational resources, it can run many full-quantum simulations of the entire system of every fundamental particle in me and my environment as far as I can see (well, apart from incoming photons of starlight, whose initial state it doesn't have access to), and extract statistics from this ensemble of simulations. But that doesn't let Omega predict whether I'll actually guess left or right, just determine that it's 50:50. Also (unless physical law is a great deal weirder than we believe), Omega is not going to be able to run these simulations as fast as real physics can happen. Humans are 70% warm water, which contains a vast amount of quantum thermal entropy being shuffled extremely fast, some of it moving at light-speed as infra-red photons, and human metabolism is strongly and non-linearly coupled to this vast quantum random-number generator via diffusion and Brownian motion. So, because of light-speed limits, the quantum processing units that the simulation ran on would need to be smaller than individual water molecules to be able to run the simulation in real time. [It just might be possible to build something like that in the outer crust of a neutron star, if there's some sufficiently interesting nucleonic chemistry under some combination of pressure and magnetic field strength there, but if so, Omega is a long way away and has many years of light-speed delay on anything it does around here.]

What Omega can do is run approximate simulations of some simplified heuristic of how I work, if one exists. If my brain were a digital computer, this might be very predictive. But a great deal of careful engineering has gone into making digital computers behave (under their operating conditions) in a way that's extremely reliably predictable by a specific simplified heuristic that doesn't require a full atomic-scale simulation. Typical physical or biological systems just don't have this property. Engineering something to ensure that it definitely doesn't have this property is easy, and in any environment containing agents with more computational resources than you, seems like a very obvious precaution.

So, an agent can easily arrange to act unpredictably, by acting randomly based on a suitably engineered randomness source rather than optimizing. Doing so makes its behavior depend on the unclonable quantum details of its initial state, so the probabilities can be predicted but the outcome cannot. In practice, even though they haven't been engineered for it, humans probably also have this property over some sufficiently long timescale (seconds, minutes or hours, perhaps), when they're not attempting to optimize the outcome of their actions.
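The distinction between predicting probabilities and predicting outcomes can be made concrete with a small Monte Carlo sketch. It is only an illustration under loud assumptions: a seeded pseudo-random generator stands in for the quantum source (which a real Omega could of course predict), and the predictor is modeled as a fixed guess that is independent of the bit.

```python
import random


def prediction_accuracy(trials: int = 100_000, seed: int = 0) -> float:
    """Fraction of trials in which a fixed guess matches a fair random bit."""
    rng = random.Random(seed)  # pseudo-random stand-in for a quantum source
    predictor_guess = 0        # any guess independent of the bit does equally well
    hits = sum(rng.getrandbits(1) == predictor_guess for _ in range(trials))
    return hits / trials


print(prediction_accuracy())
```

The predictor can know the 50:50 distribution exactly and still be right only about half the time on any single outcome.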

[Admittedly, humans leak a vast amount of quantum information about their internal state in the form of things like infra-red photons emitted from their skin, but attempting to interpret those to try to keep a vast full-quantum simulation of every electron and nucleus inside a human (and everything in their environment) on-track would clearly require running at least that much quantum calculation (almost certainly many copies of it) in at least real-time, which again due to light-speed limits would require quantum computing elements smaller than water-molecule size. So again, it's not just way outside current technological feasibility, it's actually physically impossible, with the conceivable exception of inside the crust of a neutron star.]

As a more general version of this opinion, while we may have to worry about Omegas whose technology is far beyond ours, as long as they live in the same universe as us, there are some basic features of physical law that we're pretty sure are correct and would thus also apply even to Omegas. If we had managed to solve the alignment problem contingent on basic physical assumptions like information not propagating faster than the speed of light, time loops being impossible outside event horizons, and quanta being unclonable, then personally (as a physicist) I wouldn't be too concerned. Your guess about the Singularity may vary.