Archimedes — AI Alignment Forum

I'm having trouble accepting that the Troll Bridge scenario is well-posed as opposed to a Russell-like paradox. Perhaps someone can clarify what I'm missing.

In my mind, there are two options:

If PA is inconsistent, then math is in ruins, any PA-based reasoning for crossing the bridge could be unsound, and the troll blows up the bridge. Do not cross.
If PA is consistent, then the agent cannot prove U = -10 (or anything else inconsistent) under the assumption that it has already crossed, so Löb's theorem fails to apply. In this case, there is no weird certainty that crossing is doomed.

Until or unless PA is proven inconsistent, it's reasonable to put most of the prior probability mass on PA being, in fact, consistent. We can then ignore counterfactuals that depend on proving otherwise: if PA is ever proven inconsistent, none of the rest of the reasoning matters anyway until foundational logic has been reformulated on a consistent basis.
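
For reference, here is the form of Löb's theorem I have in mind. The statement itself is the standard one, with $\Box P$ read as "PA proves $P$". Taking $P$ to be "crossing implies U = -10" is my assumption about how the Troll Bridge argument instantiates it, and the symbol $A$ for the agent's action is notation introduced here only for illustration.

```latex
% Löb's theorem for PA, where \Box P abbreviates "P is provable in PA".
\[
  \text{If } \mathrm{PA} \vdash \Box P \rightarrow P,
  \quad \text{then } \mathrm{PA} \vdash P.
\]
% Assumed instance for Troll Bridge (A is the agent's action, U its utility):
\[
  P \;\equiv\; \bigl( A = \mathrm{cross} \;\rightarrow\; U = -10 \bigr)
\]
```

The question in my second option is whether the hypothesis $\mathrm{PA} \vdash \Box P \rightarrow P$ can actually be established for this $P$ when PA is consistent.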
