Abstraction, Evolution and Gears

[-]TurnTrout5y20

I like this analysis. Whereas "Seeking Power is Instrumentally Convergent in MDPs" aimed to explain why seeking power is instrumentally convergent, this predicts qualitative properties of resource-hungry policies ("only cares about resource-relevant summary $f(X)$ to the extent it affects long-term action spaces"). I think this frame also makes plain that most sensory reward functions don't differentiate between "humans flourish + I have power" world histories and "humans perish + I have power" world histories.

More generally, I think mediating-long-term-action-space is part of how we intuitively decide what to call “resources” in the first place.

At this point, I ground my understanding of resources in the POWER(s) formulation (ability to achieve goals in general), and I think your take agrees with my model of that. Most goals we could have care more about the long-term action space, and so care about the stable and abstraction-friendly components of the environment which can widen that space (e.g. "resources"). Alternatively, resources are those things which tend to increase your POWER.

[-]Donald Hobson5y20

I think that there are some things that are sensitively dependent on other parts of the system, and we usually just call those bits random.

Suppose I had a magic device that returned the exact number of photons in its past lightcone. The answer from this box would be sensitively dependent on the internal workings of all sorts of things, but we can call the output random and still predict the rest of the world.

The flap of a butterfly's wing might affect the weather in a month's time. The weather is chaotic and sensitively dependent on a lot of things, but whatever the weather, the Earth's orbit will be unaffected (for a while; orbits are chaotic too on million-year timescales).

We can make useful predictions (like planetary orbits, and how hot the planets will get) based just on surface-level abstractions like the brightness and mass of a star, but a more detailed model containing more of the internal workings would let us predict solar flares and supernovae.

[-]johnswentworth5y10

I think that there are some things that are sensitively dependent on other parts of the system, and we usually just call those bits random.

One key piece missing here: the parts we call "random" are not just sensitively dependent on other parts of the system, they're sensitively dependent on many other parts of the system. E.g. predicting the long-run trajectories of billiard balls bouncing off each other requires very precise knowledge of the initial conditions of every billiard ball in the system. If we lack knowledge of even just one ball, then we have to treat all the long-run trajectories as random.

That's why sensitive dependence on many variables matters: lack of knowledge of just one of them wipes out all of our signal. If there's a large number of such variables, then we'll always be missing knowledge of at least one, so we call the whole system random.
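A toy way to see the "missing just one variable wipes out the signal" point, as a minimal sketch rather than a model of billiards: here the long-run outcome is stood in for by the parity of the initial bits, which flips whenever any single bit flips. The 8-bit `initial_conditions`, the `parity` stand-in, and the uniform prior over unknown bits are all assumptions made for illustration.

```python
from itertools import product


def parity(bits):
    """Toy 'long-run outcome': flips whenever any single input bit flips."""
    return sum(bits) % 2


def outcome_distribution(known_bits, n_unknown):
    """Distribution over the outcome when n_unknown bits are uniformly unknown."""
    counts = {0: 0, 1: 0}
    # Enumerate every possible setting of the bits we do not know.
    for hidden in product((0, 1), repeat=n_unknown):
        counts[parity(tuple(known_bits) + hidden)] += 1
    total = 2 ** n_unknown
    return {outcome: count / total for outcome, count in counts.items()}


# Hypothetical 8-variable system (values chosen arbitrarily).
initial_conditions = (1, 0, 1, 1, 0, 0, 1, 0)

# Full knowledge of every variable: the outcome is certain.
full = outcome_distribution(initial_conditions, 0)

# Knowledge of 7 out of 8 variables: the prediction is a coin flip.
missing_one = outcome_distribution(initial_conditions[:-1], 1)

print(full)
print(missing_one)
```

Knowing seven of the eight bits is no better than knowing none of them, which is the sense in which the whole outcome ends up being treated as random.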

[-]romeostevensit5y10

Mild progress on the intentional stance for me: take a thermostat. I realized you can count up the number of different temperatures the sensor is capable of detecting, the number of states the actuator can take in response (in the case of the thermostat, only on/off), and the functions mapping between the two. This might start to give some sense of how you can build up a multidimensional map out of multiple sensors and actuators as you do some sort of function combination.
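The counting can be sketched concretely. Everything specific below is an assumption for illustration: the 15 to 25 °C sensor range, the 20 °C setpoint, the second humidity sensor and fan, and reading "function combination" as taking products of the input and output spaces.

```python
from itertools import product

# Hypothetical sensor: whole degrees from 15 to 25 °C (11 distinguishable states).
sensor_states = list(range(15, 26))
# Thermostat actuator: only on/off.
actuator_states = ("off", "on")

# Number of distinct sensor -> actuator mappings: |actuator| ** |sensor|.
num_policies = len(actuator_states) ** len(sensor_states)


def thermostat(temp_c, setpoint=20):
    """One particular mapping: heat whenever it is colder than the setpoint."""
    return "on" if temp_c < setpoint else "off"


# The thermostat written out as an explicit lookup table.
policy = {t: thermostat(t) for t in sensor_states}

# Adding a second (hypothetical) sensor and actuator: the joint input and
# output spaces are Cartesian products, so the space of mappings grows fast.
humidity_states = ("dry", "humid")
fan_states = ("off", "on")
joint_inputs = list(product(sensor_states, humidity_states))
joint_outputs = list(product(actuator_states, fan_states))
num_joint_policies = len(joint_outputs) ** len(joint_inputs)

print(num_policies, len(joint_inputs), num_joint_policies)
```

The thermostat is one point in a space of 2^11 possible mappings, and each added sensor or actuator multiplies the dimensions of the table rather than adding to them.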

AI ALIGNMENT FORUM