Selection vs Control

[-]adamShimi5y190Review for 2019 Review

Selection vs Control is a distinction I always point to when discussing optimization. Yet this is not the two takes on optimization I generally use. My favored ones are internal optimization (which is basically search/selection), and external optimization (optimizing systems from Alex Flint’s The ground of optimization). So I do without control, or at least without Abram’s exact definition of control.

Why? Simply because the internal structure vs behavior distinction mentioned in this post seems more important than the actual definitions (which seem constrained by going back to Yudkowski’s optimization power). The big distinction is between doing internal search (like in optimization algorithms or mesa-optimizers) and acting as optimizing something. It is intuitive that you can do the second without the first, but before Alex Flint’s definition, I couldn’t put words on my intuition than the first implies the second.

So my current picture of optimization is Internal Optimization (Internal Search/Selection) \subset External Optimization (Optimizing systems). This means that I think of this post as one of the first instances of grappling at this distinction, without agreeing completely with the way it ends up making that distinction.

[-]johnswentworth5y50

I like this division a lot. One nitpick: I don't think internal optimization is a subset of external optimization, unless we're redrawing the system boundary at some point. A search always takes place within the context of a system's (possibly implicit) world-model; that's the main thing which distinguishes it from control/external optimization. If that world-model does not match the territory, then the system may not successfully optimize anything in its environment, even though it's searching for optimizing plans internally.

[-]adamShimi5y30

Thanks!

My take on internal optimization as a subset of external optimization probably works assuming convergence, because the configuration space capturing the internal state of the program (and its variables) is pushed reliably towards the configurations with a local minimum in the corresponding variable. See here.

Whether that's actually what we want is another question, but I think the point you're mentioning can be captured by whether the target subspace of the configuration space puts constraints on things outside the system (for good cartesian boundaries and all the corresponding subtleties).

[-]johnswentworth5y30

Got it, that's the case I was thinking of as "redrawing the system boundary". Makes sense.

That still leaves the problem that we can write an (internal) optimizer which isn't iterative. For instance, a convex function optimizer which differentiates its input function and then algebraically solves for zero gradient. (In the real world, this is similar to what markets do.) This was also my main complaint on Flint's notion of "optimization": not all optimizers are iterative, and sometimes they don't even have an "initial" point against which we could compare.

[-]adamShimi5y30

I'm a bit confused: why can't I just take the initial state of the program (or of the physical system representing the computer) as the initial point in configuration space for your example? The execution of your program is still a trajectory through the configuration space of your computer.

Personally, my biggest issue with optimizing systems is that I don't know what the "smaller" concerning the target space really means. If the target space has only one state less than the total configuration space, is this still an optimizing system? Should we compute a ratio of measure between target and total configuration space to have some sort of optimizing power?

[-]johnswentworth5y20

The initial state of the program/physical computer may not overlap with the target space at all. The target space wouldn't be larger or smaller (in the sense of subsets); it would just be an entirely different set of states.

Flint's notion of optimization, as I understand it, requires that we can view the target space as a subset of the initial space.

[-]Vlad Mikulik7y120

(I am unfortunately currently bogged down with external academic pressures, and so cannot engage with this at the depth I’d like to, but here’s some initial thoughts.)

I endorse this post. The distinction explained here seems interesting and fruitful.

I agree with the idea to treat selection and control as two kinds of analysis, rather than as two kinds of object – I think this loosely maps onto the distinction we make between the mesa-objective and the behavioural objective. The former takes the selection view of the learned algorithm; the latter takes the control view.

At least speaking for myself (the other authors might have different thoughts on this), the decision to talk explicitly in terms of the selection view in the mesa-optimiser post is based on an intuition that selectors, in general, have more coherently defined counterfactual behaviour. That is, given a very different input, a selector will still select an output that scores well on its mesa-objective, because that’s how selectors work. Whereas a controller, to the degree it optimises for an objective, seems more likely to just completely stop working on a different input. I have fairly low confidence in this argument, however: it seems to me that one can plausibly have pretty coherent counterfactual behaviour in a very broad distribution even without doing selection. And since it is ultimately the behaviour that does the damage, it would be good to have a working distinction that is based purely on that. We (the mesa-optimisation authors) haven’t been able to come up with one.

Another reason to be interested in selectors is that in RL, the learned algorithm is supposed to fill a controller role. So, restricting attention to selectors allows to talk at least somewhat meaningfully about non-optimiser agents, which is otherwise difficult, as any learned agent is in a controller-shaped context.

In any case, I hope that more work happens on this problem, either dissolving the need to talk about optimisation, or at least making all these distinctions more precise. The vagueness of everything is currently my biggest worry about the legitimacy of mesa-optimiser concerns.

[-]abramdemski7y70

Yeah, I agree with most of what you're saying here.

A learned controller which isn't implementing any internal selection seems more likely to be incoherent out-of-distribution (ie lack a strong IRL interpretation of its behavior), as compared with a mesa-optimizer;
However, this is a low-confidence argument at present; it's very possible that coherent controllers can appear w/o necessarily having a behavioral objective which matches the original objective, in which case a version of the internal alignment problem applies. (But this might be a significantly different version of the internal alignment problem.)

I think a crux here is: to what extent are mesa-controllers with simple behavioral objectives going to be simple? The argument that mesa-optimizers can compress coherent strategies does not apply here.

Actually, I think there's an argument that certain kinds of mesa-controllers can be simple: the mesa-controllers which are more like my rocket example (explicit world model; explicit representation of objective within that world model; but, optimal policy does not use any search). There is also other reason to suspect that these could survive techniques which are designed to make sure mesa-optimizers don't arise: they aren't expending a bunch of processing power on an internal search, so, you can't eliminate them with some kind of processing-power device. (Not that we know of any such device that eliminates mesa-optimizers -- but if we did, it may not help with rocket-type mesa-controllers.)

Terminology point: I like the term 'selection' for the cluster I'm pointing at, but, I keep finding myself tempted to say 'search' in an AI context. Possibly, 'search vs control' would be better terminology.

[-]Vlad Mikulik7y70

to what extent are mesa-controllers with simple behavioural objectives going to be simple?

I’m not sure what “simple behavioural objective” really means. But I’d expect that for tasks requiring very simple policies, controllers would do, whereas the more complicated the policy required to solve a task, the more one would need to do some kind of search. Is this what we observe? I’m not sure. AlphaStar and OpenAI Five seem to do well enough in relatively complex domains without any explicit search built into the architecture. Are they using their recurrence to search internally? Who knows. I doubt it, but it’s not implausible.

certain kinds of mesa-controllers can be simple: the mesa-controllers which are more like my rocket example (explicit world-model; explicit representation of objective within that world model; but, optimal policy does not use any search).

The rocket example is interesting. I guess the question for me there is, what sorts of tasks admit an optimal policy that can be represented in this way? Here it also seems to me like the more complex an environment, the more implausible it seems that a powerful policy can be successfully represented with straightforward functions. E.g., let’s say we want a rocket not just to get to the target, but to self-identify a good target in an area and pick a trajectory that evades countermeasures. I would be somewhat surprised if we can still represent the best policy as a set of non-searchy functions. So I have this intuition that for complex state spaces, it’s hard to find pure controllers that do the job well.

[-]abramdemski7y30

Yeah, I agree that this seems possible, but extremely unclear. If something uses a fairly complex algorithm like FFT, is it search? How "sophisticated" can we get without using search? How can we define "search" and "sophisticated" so that the answer is "not very sophisticated"?

[-]Ramana Kumar6y104

Thanks for a great post! I have a small confusion/nit regarding natural selection. Despite its name, I don't think it's a good exemplar of a selection process. Going through the features of a selection process from the start of the post:

can directly instantiate any element of the search space. No: natural selection can only make local modifications to previously instantiated points. But you already dealt with this local search issue in Choices Don't Change Later Choices.
gets direct feedback on the quality of each element. Yes.
quality of element does not depend on previous choices. No, the evaluation of an element in natural selection depends a great deal on previous choices because they usually make up important parts of its environment. I think this is the thrust of the claim that natural selection is online (which I agree with).
only the final output matters. No? From the perspective of natural selection, I think the quality of the current output is what matters.

I'd love to know why natural selection seemed obvious as an example of a selection process, since it did not to me due to its poor score on the checklist above.

[-]johnswentworth5y80Review for 2019 Review

In a field like alignment or embedded agency, it's useful to keep a list of one or two dozen ideas which seem like they should fit neatly into a full theory, although it's not yet clear how. When working on a theoretical framework, you regularly revisit each of those ideas, and think about how it fits in. Every once in a while, a piece will click, and another large chunk of the puzzle will come together.

Selection vs control is one of those ideas. It seems like it should fit neatly into a full theory, but it's not yet clear what that will look like. I revisit the idea pretty regularly (maybe once every 3-4 months) to see how it fits with my current thinking. It has not yet had its time, but I expect it will (that's why it's on the list, after all).

Bearing in mind that the puzzle piece has not yet properly clicked, here are some current thoughts on how it might connect to other pieces:

Selection and control have different type signatures.
A selection process optimizes for the values of variables in some model, which may or may not correspond anything in the real world. Human values seem to be like this - see Human Values Are A Function Of Humans' Latent Variables.
A control process, on the other hand, directly optimizes things in its environment. A thermostat, for instance, does not necessarily contain any model of the temperature a few minutes in the future; it just directly optimizes the value of the temperature a few minutes in the future.
The post basically says it, but it's worth emphasizing: reinforcement learning is a control process, expected utility maximization is a selection process. The difference in type signatures between RL and EU maximization is the same as the difference in type signatures between selection and control.
Inner and outer optimizers can have different type signatures: an outer controller (e.g. RL) can learn an inner selector (e.g. utility maximizer), or an outer selector (e.g. a human) can build an inner controller (e.g. a thermostat), or they could match types with or without matching models/objectives. Which things even can match depends on the types involved - e.g. if one of the two is a controller, it may not have any world-model, so it's hard to talk about variables in its world-model corresponding to variables in a selector's world-model.
The Good Regulator Theorem roughly says that the space of optimal controllers always includes a selector (although it doesn't rule out additional non-selectors in that space).

[-]Vaniver5y40Nomination for 2019 Review

In my ~~wayward youth~~formal education, I studied numerical optimization, controls systems, the science of decision-making, and related things, and so some part of me was always irked by the focus on utility functions and issues with them; take this early comment of mine and the resulting thread as an example. So I was very pleased to see a post that touches on the difference between the approaches and the resulting intuitions bringing it more into the thinking of the AIAF.

That said, I also think I've become more confused about what sorts of inferences we can draw from internal structure to external behavior, when there are Church-Turing-like reasons to think that a robot built with mental strategy X can emulate a robot built with mental strategy Y, and both psychology and practical machine learning systems look like complicated pyramids built out of simple nonlinearities that can approximate general functions (but with different simplicity priors, and thus efficiencies). This sort of distinction doesn't seem particularly useful to me from the perspective of constraining our expectations, while it does seem useful for expanding them. [That is, the range of future possibilities seems broader than one would expect if they only thought in terms of selection, or only thought in terms of control.]

[-]Ben Pace5y10Nomination for 2019 Review

This felt to me like an important distinction to think about when thinking about optimization.

AI ALIGNMENT FORUM
AF

AI ALIGNMENT FORUM
AF

61

61

The Basic Idea

Selection

Control

Bottlecaps & Thermostats

Processes Within Processes

Blurring the Lines: What's the Critical Distinction?

Perfect Feedback

Choices Don't Change Later Choices

Offline vs Online

Internal Features vs Context