Epistemic status: Random ranting about things I'm confused about. Hopefully this post makes you more confused about optimization/agency/etc.

 

A few parables of counterfactuals

  • What if 2 + 2 equaled 3?
    • American: Hmm, yeah. What if giving me two dollars and then another two had me end up with 3 dollars? I can imagine that.
    • Me: What! This can obviously never happen! This would break literally everything! It's literally a logical contradiction!
  • What if quantum computers could solve NP-complete problems?
    • Me: Hmm sounds cool, I can imagine-
    • Scott Aaronson: What! They obviously can't because (unintelligible). You'd have to fundamentally alter how reality works, and even then things might not be salvageable.
  • What if there wasn't a storm today in Texas?
    • Ancient Greek: I'm imagining Zeus wasn't as angry; perhaps Hera calmed him down.
    • Weather Expert: I knew there'd be a storm 3 days ago; if there hadn't been a storm today, that would necessarily change the weather in the adjacent states over the past few days.
    • Superintelligence: Uh... There had to be a storm today...

What is optimization?

 

Suppose you view optimization as "pushing the world into low-probability states." Consider the following:

  • Does an asteroid perform optimization? We can predict its path years in advance, so how "low-probability" was the collision?
  • Many here would intuitively agree that Elon Musk is a powerful optimizer (pushed the world into low-probability states). Yet, a sufficiently powerful predictor wouldn't have been surprised by Tesla and SpaceX succeeding.
  • Does the Bible perform optimization (insofar as the world looks different in the counterfactual without it)? Or does the "credit" go to its authors? (Same with an imaginary being everyone believes in)

Can we really say optimization is a thing in the territory? Or is it an artifact of the map?
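To make the map-vs-territory worry concrete, here is a minimal sketch of the common "optimization power = negative log probability of an outcome at least this good, under some baseline distribution" formalization. The baseline probabilities below are invented for illustration; the point is only that the number you get depends on whose distribution you plug in.

```python
import math

def optimization_power_bits(p_baseline: float) -> float:
    """Bits of apparent optimization: -log2 of the baseline probability
    of an outcome at least as "good" as the one actually observed."""
    return -math.log2(p_baseline)

# The same asteroid impact, scored under two different maps:
p_ignorant   = 1e-6   # no orbital data: "a one-in-a-million fluke"
p_astronomer = 0.999  # computed the orbit years in advance

print(optimization_power_bits(p_ignorant))    # ~19.9 bits of "optimization"
print(optimization_power_bits(p_astronomer))  # ~0.001 bits -- nothing to explain
```

The asteroid didn't change; only the observer's distribution did, which is exactly the sense in which the measure lives in the map.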

The subjectiveness of probability "infects" all the concepts that use it, as do the limitations of the theory (like being compute-bounded, a.k.a. the problem of logical counterfactuals).

 

For example, if optimization is pushing the universe into subjectively unlikely states, then all the confusion gets pushed into the word "pushing"[1].


This is what philosophy does to you. This is why bits of the universe shouldn't think too hard about themselves.

I suspect I've been nerdsniped by a wrong question somehow. This line of thought doesn't seem productive for aligning the AI... Curious to hear takes in the comments.

 

  1. ^

    Pun intended.


Comments

ZT5:

Does an asteroid perform optimization? We can predict its path years in advance, so how "low-probability" was the collision?

It does, because in the counterfactual world where the asteroid didn't exist, the collision would not have happened.
The "low probability" refers to some prior distribution. Once the collision has happen, it has probability 1.

Suppose you view optimization as "pushing the world into low-probability states."

That is the source of your confusion: once an event has happened, it has probability 1. But the event will happen, so in that sense it already has probability 1.

The time-bound perspective on this is: a sufficiently powerful predictor can anticipate the effect of the optimization. But it still requires the optimizer to actually perform the optimization, through its actions/existence.

The timeless perspective on this is: the mere existence of the optimizer shifts (or rather, has already shifted) the probabilities in the world where it exists towards the states it optimizes for, compared to the counterfactual world where that optimizer does not exist.
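A toy version of that timeless comparison, with invented numbers (the only point is that the "shift" is a comparison between two worlds, not anyone's surprise after the fact):

```python
# Invented outcome distributions for the world with and without the optimizer.
p_with_optimizer    = {"target_state": 0.30, "other": 0.70}
p_without_optimizer = {"target_state": 0.01, "other": 0.99}

# A perfect predictor of the actual world assigns ~1 to whatever happens,
# but the cross-world gap is unchanged by that.
shift = p_with_optimizer["target_state"] - p_without_optimizer["target_state"]
print(round(shift, 2))  # 0.29 of probability mass moved toward the optimizer's target
```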

Does the bible perform optimization (insofar as the world looks different in the counterfactual without it)? Or does the "credit" go to its authors? (Same with an imaginary being everyone believes in)

Credit does not have to add up to 100%. Each of the links in a chain is 100% responsible for the chain maintaining its integrity: if any one of them failed, the entire chain would fail. If a chain has 10 links, that adds up to 1000%.
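A worked version of the chain arithmetic, just to spell out the counterfactual-credit accounting being used:

```python
# Counterfactual credit for one link: chain integrity with the link minus without it.
links = 10
integrity_with_link    = 1.0  # all links present: the chain holds
integrity_without_link = 0.0  # remove any single link: the chain fails

credit_per_link = integrity_with_link - integrity_without_link  # 1.0, i.e. 100%
print(links * credit_per_link)  # 10.0, i.e. 1000% -- credits need not sum to 100%
```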

I suspect I’ve been nerdsniped by a wrong question somehow.

"What if X happened?" means "what if X happened, and the set of things I can and do think of when analyzing events in the implied context otherwise stayed the same?" This set doesn't include a complete causal chain (and, since you're a finite human, couldn't possibly do so.)

"What if quantum computers could solve P-=NP?" doesn't mean you should consider the effect that quantum computers have on other things because when you think about those other things your chain of reasoning normally won't go all the way back to the relevant math and physics.

You could choose to go back to math and physics anyway, but by doing so you are misreading the question--the question implies "only go back as far as you normally would go." You could also say "well, the implied context is 'make deductions about math and physics'", in which case yeah, it's a good objection, but you may not be very good at reading implied contexts.

Interesting perspective, kinda reminds me of the ROME paper where it seems to only do "shallow counterfactuals".
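One loosely related way to formalize "only go back as far as you normally would go" (not necessarily what either commenter has in mind) is an interventional, do()-style counterfactual: overwrite the variable and leave its causes alone, rather than reasoning backwards about what the past "must have been," which is roughly the move the weather expert and the superintelligence make in the parable. A toy sketch with a made-up two-variable weather model:

```python
import random

random.seed(0)

def sample_world():
    """Toy causal chain: low pressure three days ago causes today's storm."""
    low_pressure_3_days_ago = random.random() < 0.5
    storm_today = low_pressure_3_days_ago
    return low_pressure_3_days_ago, storm_today

worlds = [sample_world() for _ in range(10_000)]

# Conditioning (the weather expert's reading): among worlds with no storm,
# the past is forced to have been different too.
no_storm_worlds = [lp for lp, storm in worlds if not storm]
print(sum(no_storm_worlds) / len(no_storm_worlds))  # 0.0 -- the past "had to" differ

# Intervening, do(storm_today = False): overwrite the storm, leave the past alone.
# The marginal over the past variable is untouched.
print(sum(lp for lp, _ in worlds) / len(worlds))    # ~0.5 -- the past stays as it was
```

The "shallow counterfactual" in the comment is roughly the second computation: change the thing and keep everything upstream fixed, rather than solving for a globally consistent history.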

I think an optimizer is a thing that makes decisions that are aimed at optimizing a certain quantity.  I think, when you talk about what a decisionmaker does and its effects, you're essentially saying "What if their decisions were this, vs what if their decisions were that?".  If you say "Well, actually their personality and known objectives dictate that their decisions would (almost certainly) be this", then you're not considering them as a decisionmaker, but as one of the fixed features of the world.

It is arguably the case that, along the lines of your analogies, the general question of "What if their decisions were different?" is sometimes divorced from reality—e.g. "Yes, the dictator's bodyguards do have the power to kill him, but they've also been selected for being loyal and unlikely to do that, so it doesn't make much sense to talk as though that's a real possibility."  Though, at least to my knowledge, humans are sufficiently unpredictable, and things like unexpected mental illness can happen, such that you can never really say the chance is zero that a human will decide to do something.  With a computer program, you can get a lot closer to certainty—but there are gamma ray bitflips and such.  So I would say, it's pretty much always physically possible for the decisionmaker to pick anything; whether it's likely is a different question.