Identifiability Problem for Superrational Decision Theories

"The same" in what sense? Are you saying that what I described in the context of game theory is not surprising, or outlining a way to explain it in retrospect?

Communication won't make a difference if you're playing with a copy.

Well, if I understand the post correctly, you're saying that these two problems
are fundamentally the same problem, and so rationality should be able to solve
them both if it can solve one. I disagree with that, because from the
perspective of distributed computing (which I'm used to), these two problems are
exactly the two kinds of problems that are fundamentally distinct in a
distributed setting: agreement and symmetry-breaking.
Actually it could. Basically all of distributed computing assumes that every
process is running the same algorithm, and you can solve symmetry-breaking in
this case with communication and an additional constraint on the scheduling of
processes. The difficulty here is that the underlying graph is symmetric;
if you had some form of asymmetry (like three processes in a line, such
that the one in the middle has two neighbors but the others only have one), then
you could use that asymmetry directly to solve symmetry-breaking.
(By the way, you just gave me the idea that maybe I can use my knowledge of
distributed computing to look at the sort of decision problems where you play
with copies? Don't know if it would be useful, but that's interesting at least)

Phylactery Decision Theory

Another problem with this is that it isn't clear how to form the hypothesis "I have control over X".

You don't. I sometimes use talk of control to describe what the agent is doing from the outside, but the hypotheses it believes all have a form like "The variables such and such will be as if they were set by BDT given such and such inputs".

One problem with this is that it doesn't actually rank hypotheses by which is best (in expected utility terms), just how much control is implied.

For the first setup, where it's trying to learn what it has control ov... (read more)

Right, but then, are all other variables unchanged? Or are they influenced
somehow? The obvious proposal is EDT -- assume influence goes with correlation.
Another possible answer is "try all hypotheses about how things are influenced."

Reflective Bayesianism

From my perspective, Radical Probabilism is a gateway drug.

This post seemed to be praising the virtue of returning to the lower-assumption state. So I argued that in the example given, it took more than knocking out assumptions to get the benefit.

So, while I agree, I really don't think it's cruxy.

It wasn't meant to be. I agree that logical inductors seem to de facto implement a Virtuous Epistemic Process, with attendant properties, whether or not they understand that. I just tend to bring up any interesting-seeming thoughts that are triggered during ... (read more)

Agreed. Simple Bayes is the hero of the story in this post, but that's more
because the simple Bayesian can recognize that there's something beyond.

Reflective Bayesianism

Either way, we've made assumptions which tell us which Dutch Books are valid. We can then check what follows.

Ok. I suppose my point could then be made as "#2-type approaches aren't very useful, because they assume something that's no easier than what they provide".

I think this understates the importance of the Dutch-book idea to the actual construction of the logical induction algorithm.

Well, you certainly know more about that than me. Where did the criterion come from in your view?

This part seems entirely addressed by logical induction, to me.

Quite p... (read more)

I wanted to separate what work is done by radicalizing probabilism in general, vs logical induction specifically.

From my perspective, Radical Probabilism is a gateway drug. Explaining logical induction intuitively is hard. Radical Probabilism is easier to explain and motivate. It gives reason to believe that there's something interesting in the direction. But, as I've stated before, I have trouble comprehending how Jeffrey *correctly predicted* that there's something interesting here, without logical uncertainty as a motivation. In hindsight, I feel hi... (read more)

Reflective Bayesianism

What is actually left of Bayesianism after Radical Probabilism? Your original post on it was partially explaining logical induction, and introduced assumptions from that in much the same way as you describe here. But without that, there doesn't seem to be a whole lot there. The idea is that all that matters is resistance to Dutch books, and for a Dutch book to be fair the bookie must not have an epistemic advantage over the agent. Said that way, it depends on some notion of "what the agent could have known at the time", and giving a coherent account of thi... (read more)

Part of the problem is that I avoided getting too technical in Radical
Probabilism, so I bounced back and forth between different possible versions of
Radical Probabilism without too much signposting.
I can distinguish at least five versions:
1. Jeffrey's version. I don't have a good source for his full picture. I get
the sense that the answer to "what is left?" is "very little!" -- EG, he
didn't think agents have to be able to articulate probabilities. But I am
not sure of the details.
2. The simplification of Jeffrey's version, where I keep the Kolmogorov axioms
(or the Jeffrey-Bolker axioms) but reject Bayesian updates.
3. Skyrms' deliberation dynamics. This is a pretty cool framework and I
recommend checking it out (perhaps via his book The Dynamics of Rational
Deliberation). The basic idea of its non-bayesian updates is, it's fine so
long as you're "improving" (moving towards something good).
4. The version represented by logical induction.
5. The Shafer & Vovk version. I'm not really familiar with this version, but I
hear it's pretty good.
(I can think of more, but I cut myself off.)
Making a broad generalization, I'm going to stick things into camp #2 above or
camp #4. Theories in camp #2 have the feature that they simply assume a solid
notion of "what the agent could have known at the time". This allows for a nice
simple picture in which we can check Dutch Book arguments. However, it does lend
itself more easily to logical omniscience, since it doesn't allow a nuanced
picture of how much logical information the agent can generate. Camp #4 means we
do give such a nuanced picture, such as the poly-time assumption.
Either way, we've made assumptions which tell us which Dutch Books are valid. We
can then check what follows.
I think this understates the importance of the Dutch-book idea to the actual
construction of the logical induction algorithm. The criterion came first, and
the construction was finished soon after.
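For readers who haven't seen the Dutch-book idea spelled out, a minimal sketch (my own toy numbers, not from the logical induction paper): an agent whose credences in A and ¬A don't sum to 1 will accept a pair of bets that loses money in every possible world.

```python
def dutch_book_loss(p_a, p_not_a, stake=1.0):
    """A bookie sells the agent a bet paying `stake` if A, priced at
    p_a * stake, and a bet paying `stake` if not-A, priced at
    p_not_a * stake. Exactly one bet pays out, so the agent's net loss
    is the same in every possible world."""
    return (p_a + p_not_a) * stake - stake

incoherent_loss = dutch_book_loss(0.6, 0.6)   # sure loss of 0.2
coherent_loss = dutch_book_loss(0.6, 0.4)     # no guaranteed loss
```

The interesting question in the thread is which such books count as "fair", i.e. what the bookie is allowed to know; the arithmetic above is the easy part.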

Troll Bridge

If you're reasoning using PA, you'll hold open the possibility that PA is inconsistent, but you won't hold open the possibility that A&¬A. You believe the world is consistent. You're just not so sure about PA.

Do you? This sounds like PA is not actually the logic you're using. Which is realistic for a human. But if PA is indeed inconsistent, and you don't have some further-out system to think in, then what is the difference to you between "PA is inconsistent" and "the world is inconsistent"? In both cases you just believe everything and its negatio... (read more)

Maybe this is the confusion. I'm not using PA. I'm assuming (well, provisionally
assuming) PA is consistent.
If PA is consistent, then an agent using PA believes the world is consistent --
in the sense of assigning probability 1 to tautologies, and also assigning
probability 0 to contradictions.
(At least, 1 to tautologies it can recognize, and 0 to contradictions it can
recognize.)
Hence, I (standing outside of PA) assert that (since I think PA is probably
consistent) agents who use PA don't know whether PA is consistent, but, believe
the world is consistent.
If PA were inconsistent, then we need more assumptions to tell us how
probabilities are assigned. EG, maybe the agent "respects logic" in the sense of
assigning 0 to refutable things. Then it assigns 0 to everything. Maybe it
"respects logic" in the sense of assigning 1 to provable things. Then it assigns
1 to everything. (But we can't have both. The two notions of "respect logic" are
equivalent if the underlying logic is consistent, but not otherwise.)
But such an agent doesn't have much to say for itself anyway, so it's more
interesting to focus on what the consistent agent has to say for itself.
And I think the consistent agent very much does not "hold open the possibility"
that the world is inconsistent. It actively denies this.

Troll Bridge

If I'm using PA, I can prove that ¬(A&¬A).

Sure, that's always true. But sometimes it's also true that A&¬A. So unless you believe PA is consistent, you need to hold open the possibility that the ball will both (stop and continue) and (do at most one of those). But of course you can also prove that it will do at most one of *those*. And so on. I'm not very confident what's right; ordinary imagination is probably just misleading here.

It seems particularly absurd that, in some sense, the reason you think that is just because you think that.

The fa... (read more)

I think you're still just confusing levels here. If you're reasoning using PA,
you'll hold open the possibility that PA is inconsistent, but you won't hold
open the possibility that A&¬A. You believe the world is consistent. You're just
not so sure about PA.
I'm wondering what you mean by "hold open the possibility".
* If you mean "keep some probability mass on this possibility", then I think
most reasonable definitions of "keep your probabilities consistent with your
logical beliefs" will forbid this.
* If you mean "hold off on fully believing things which contradict the
possibility", then obviously the agent would hold off on fully believing PA
itself.
* Etc for other reasonable definitions of holding open the possibility (I
claim).

Troll Bridge

Here's what I imagine the agent saying in its defense:

Yes, of course I can control the consistency of PA, just like everything else can. For example, imagine that you're using PA and you see a ball rolling. And then in the next moment, you see the ball stopping and you also see the ball continuing to roll. Then obviously PA is inconsistent.

Now you might think this is dumb, because it's impossible to see that. But why do you think it's impossible? Only because it's inconsistent. But if you're using PA, you must believe PA really might be inconsistent, so you ca... (read more)

This part, at least, I disagree with. If I'm using PA, I can prove that ¬(A&¬A).
So I don't need to believe PA is consistent to believe that the ball won't stop
rolling and also continue rolling.
On the other hand, I have no direct objection to believing you can control the
consistency of PA by doing something else than PA says you will do. It's not a
priori absurd to me. I have two objections to the line of thinking, but both are
indirect.
1. It seems absurd to think that if you cross the bridge, it will definitely
collapse. It seems particularly absurd that, in some sense, the reason you
think that is just because you think that.
2. From a pragmatic/consequentialist perspective, thinking in this way seems to
result in poor outcomes.

Limiting Causality by Complexity Class

The first sentence of your first paragraph appears to appeal to experiment, while the first sentence of your second paragraph seems to boil down to "Classically, X causes Y if there is a significant statistical connection twixt X and Y."

No. "Dependence" in that second sentence does not mean causation. It just means statistical dependence. The definition of dependence is important because an intervention must be statistically independent from things "before" the intervention.

None of these appear to involve intervention.

These are methods of causal inf... (read more)
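The point that an intervention must be statistically independent of what comes "before" it can be illustrated with a toy simulation (hypothetical variables and numbers of my own): a confounder Z drives both X and Y, so conditioning on X=1 inflates P(Y) relative to actually setting X=1.

```python
import random

rng = random.Random(0)

def sample(intervene_x=None):
    """One draw from a toy world: confounder Z causes both X and Y."""
    z = rng.random() < 0.5
    x = z if intervene_x is None else intervene_x   # do(X=x) cuts Z -> X
    y = z or (rng.random() < 0.1)                   # Z causes Y, plus noise
    return x, y

def estimate_p_y(n=100_000, intervene_x=None, condition_x=None):
    hits = total = 0
    for _ in range(n):
        x, y = sample(intervene_x)
        if condition_x is None or x == condition_x:
            total += 1
            hits += y
    return hits / total

p_see = estimate_p_y(condition_x=True)   # P(Y=1 | X=1): confounded
p_do = estimate_p_y(intervene_x=True)    # P(Y=1 | do(X=1))
```

Here conditioning gives roughly 1.0 (selecting X=1 selects Z=1), while intervening gives roughly 0.55; the gap is exactly what the intervention's independence from Z buys you.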

Limiting Causality by Complexity Class

Pearl's answer, from IIRC Chapter 7 of Causality, which I find 80% satisfying, is about using external knowledge about repeatability to consider a system in isolation. The same principle gets applied whenever a researcher tries to shield an experiment from outside interference.

This is actually a good illustration of what I mean. You can't shield an experiment from outside influence entirely, not even in principle, because it's *you* doing the shielding, and your activity is caused by the rest of the world. If you decide to only look at a part of the world, on... (read more)

Causal inference has long been about how to take small assumptions about
causality and turn them into big inferences about causality. It's very bad at
getting causal knowledge from nothing. This has long been known.
For the first: Well, yep, that's why I said I was only 80% satisfied.
For the second: I think you'll need to give a concrete example, with edges,
probabilities, and functions. I'm not seeing how to apply thinking about
complexity to a type causality setting, where it's assumed you have actual
probabilities on co-occurrences.

Limiting Causality by Complexity Class

What I had in mind was increasing precision of Y.

I guess that makes sense. Thanks for clarifying!

Limiting Causality by Complexity Class

X and Y are variables for events. By complexity class I mean computational complexity; I'm not sure what scaling parameter is supposed to be there.

Computational complexity only makes sense in terms of varying sizes of inputs.
Are some Y events "bigger" than others in some way so that you can look at how
the program runtime depends on that "size"?

Normativity

we have an updating process which can change its mind about any particular thing; and that updating process itself is not the ground truth, but rather has beliefs (which can change) about what makes an updating process legitimate.

This should still be a strong formal theory, but one which requires weaker assumptions than usual

There seems to be a bit of a tension here. What you're outlining for most of the post still requires a formal system with assumptions within which to take the fixed point, but then that would mean that it can't change its mind about *an*... (read more)

It's sort of like the difference between a programmable computer vs an arbitrary
blob of matter. A programmable computer provides a rigid structure which can't
be changed, but the set of assumptions imposed really is quite light. When
programming language designers aim for "totally self-revising systems"
(languages with more flexibility in their assumptions, such as Lisp), they don't
generally attack the assumption that the hardware should be fixed. (Although
occasionally they do go as far as asking for FPGAs.)
(a finite approximation of) Solomonoff Induction can be said to make "very few
assumptions", because it can learn a wide variety of programs. Certainly it
makes fewer assumptions than more special-case machine learning systems. But it
also makes a lot more assumptions than the raw computer. In particular, it has
no allowance for updating against the use of Bayes' Rule for evaluating which
program is best.
I'm aiming for something between the Solomonoff induction and the programmable
computer. It can still have a rigid learning system underlying it, but in some
sense it can learn any particular way of selecting hypotheses, rather than being
stuck with one.
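The contrast with (finitely approximated) Solomonoff induction can be made concrete. A toy sketch (hypothetical hypotheses and numbers of my own): hypotheses get a simplicity-flavored prior and are re-weighted by Bayes' rule, and that re-weighting rule is exactly the fixed assumption the comment says cannot be updated against.

```python
def bayes_mixture(hypotheses, priors, observations):
    """Re-weight hypotheses by Bayes' rule after each observation.
    `hypotheses` are likelihood functions: obs -> P(obs | h)."""
    weights = list(priors)
    for obs in observations:
        weights = [w * h(obs) for w, h in zip(weights, hypotheses)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights

# Two hypothetical coin-flip predictors (1 = heads):
fair = lambda obs: 0.5
biased = lambda obs: 0.9 if obs == 1 else 0.1

# A prior favoring `fair`; a run of ten heads overturns it.
posterior = bayes_mixture([fair, biased], [0.8, 0.2], [1] * 10)
```

The system can learn which hypothesis is best, but the multiplicative update in the loop is hard-coded; nothing in the data can ever argue against it.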
This seems like a rather excellent question which demonstrates a high degree of
understanding of the proposal.
I think the answer from my not-necessarily-foundationalist but
not-quite-pluralist perspective (a pluralist being someone who points to the
alternative foundations proposed by different people and says "these are all
tools in a well-equipped toolbox") is:
The meaning of a confused concept such as "the real word for X" is not
ultimately given by any rigid formula, but rather, established by long
deliberation on what it can be understood to mean. However, we can understand a
lot of meaning through use. Pragmatically, what "the real word for X" seems to
express is that there is a correct thing to call something, usually uniquely
determined, which can be discovered through investigation (EG by askin

What is the interpretation of the do() operator?

But in a newly born child or blank AI system, how does it acquire causal models?

I see no problem assuming that you start out with a prior over causal models - we do the same for probabilistic models after all. The question is how the updating works, and whether, assuming the world has a causal structure, this way of updating can identify it.

I myself think (but I haven't given it enough thought) that there might be a bridge from data to causal models though falsification. Take a list of possible causal models for a given problem and search through your d... (read more)

What is the interpretation of the do() operator?

If Markov models are simple explanations of our observations, then what's the problem with using them?

To be clear, by total probability distribution I mean a distribution over all possible conjunctions of events. A Markov model also creates a total probability distribution, but there are multiple Markov models with the same probability distribution. Believing in a Markov model is more specific, and so if we could do the same work with just probability distributions, then Occam would seem to demand we do.

The surface-level answer to your question would... (read more)
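The claim that multiple Markov models induce the same total probability distribution is easy to exhibit (toy numbers of my own): a two-variable network X → Y and its Bayes-inverted counterpart Y → X assign identical probability to every conjunction of events.

```python
from itertools import product

# Model A: X -> Y, with P(X=1)=0.5, P(Y=1|X=1)=0.8, P(Y=1|X=0)=0.2.
def joint_a(x, y):
    p_y1 = 0.8 if x else 0.2
    return 0.5 * (p_y1 if y else 1 - p_y1)

# Model B: Y -> X, obtained by Bayes-inverting model A
# (P(Y=1)=0.5 and P(X=1|Y=1)=0.8, P(X=1|Y=0)=0.2 by symmetry).
def joint_b(x, y):
    p_x1 = 0.8 if y else 0.2
    return 0.5 * (p_x1 if x else 1 - p_x1)

# The two causal structures induce the same total distribution.
mismatch = max(abs(joint_a(x, y) - joint_b(x, y))
               for x, y in product([0, 1], repeat=2))
```

Since the joints agree exactly, no amount of observational data distinguishes the two structures, which is what gives the Occam worry its bite.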

I think Judea Pearl would answer that the do() operator is the most
reductionistic explanation that is possible. The point of the do calculus is
precisely that it can't be found in the data (the difference between do(x) and
"see(x)") and requires causal assumptions. Without a causal model, there is no
do operator. And conversely, one cannot create a causal model from pure data
alone- The do operator is on a higher rung of "the ladder of causality" from
bare probabilities.
I feel like there's a partial answer to your last question in that do-calculus
is to causal reasoning what Bayes' rule is to probability. The do-calculus
can be derived from probability rules and the introduction of the do() operator,
but the do() operator itself is something that cannot be explained in non-causal
terms. Pearl believes we inherently use some version of do calculus when we
think about causality.
These ideas are all in Pearl's "The Book of Why".
But now I think your question is where do the models come from? For researchers,
the causal models they create come from background information they have of the
problem they're working with. A confounder is possible between these parameters,
but not those because of randomization etc. etc.
But in a newly born child or blank AI system, how does it acquire causal models?
If that is explained, then we have answered your question. I don't have a good
answer.
I myself think (but I haven't given it enough thought) that there might be a
bridge from data to causal models though falsification. Take a list of possible
causal models for a given problem and search through your data. You might not be
able to prove your assumptions, but you might be able to rule causal models out,
if they suppose there is a causal relation between two variables that show no
correlation at all.
The trouble is, you don't know whether you can rule out the correlation, or if
there is a correlation which doesn't show in the data because of a confounder.
It seems plausible t
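The falsification idea in this comment can be sketched as code (a hypothetical three-model example, my own construction): generate data where X causes Y, then rule out any candidate model whose implied independence contradicts the observed correlation. Note that X → Y and Y → X survive together, which is exactly the equivalence/confounder trouble raised above.

```python
import random

rng = random.Random(1)

# Hypothetical data generated by X -> Y (X causes Y with 90% reliability).
data = []
for _ in range(10_000):
    x = rng.random() < 0.5
    y = x if rng.random() < 0.9 else not x
    data.append((x, y))

def correlated(pairs, eps=0.05):
    """Crude independence check: |P(X,Y) - P(X)P(Y)| > eps."""
    n = len(pairs)
    px = sum(x for x, _ in pairs) / n
    py = sum(y for _, y in pairs) / n
    pxy = sum(x and y for x, y in pairs) / n
    return abs(pxy - px * py) > eps

# Each candidate model implies either independence of X and Y or not.
models = {"X -> Y": False, "Y -> X": False, "X independent of Y": True}
surviving = [name for name, implies_indep in models.items()
             if implies_indep != correlated(data)]
```

This rules models out but cannot prove one; and as the comment notes, a confounder can hide a real correlation, so even the ruling-out is only as good as the data.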

The Dualist Predict-O-Matic ($100 prize)

One possibility is that it's able to find a useful outside view model such as "the Predict-O-Matic has a history of making negative self-fulfilling prophecies". This could lead to the Predict-O-Matic making a negative prophecy ("the Predict-O-Matic will continue to make negative prophecies which result in terrible outcomes"), but this prophecy wouldn't be selected for being self-fulfilling. And we might usefully ask the Predict-O-Matic whether the terrible self-fulfilling prophecies will continue conditional on us tak... (read more)

No. I think:

As outlined in the last paragraph of the post. I want to convince people that TDT-like decision theories won't give a "neat" game theory, by giving an example where they're even less neat than classical game theory.

I think you're thinking about a realistic case (same algorithm, similar environment... (read more)