This is a special post for short-form writing by Cinera Verinia. Only they can create top-level comments. Comments here also appear on the Shortform Page and All Posts page.

Some Nuance on Learned Optimisation in the Real World

I think mesa-optimisers should not be thought of as learned optimisers, but as systems that employ optimisation/search as part of their inference process.

The simplest case for this view is that pure optimisation during inference is computationally intractable in rich environments (e.g. the real world), so systems (e.g. humans) operating in the real world do not perform inference solely by directly optimising over outputs.

Rather, optimisation is employed sometimes, as one part of their inference strategy. That is, systems only optimise their outputs part of the time (at other [most?] times they execute learned heuristics[1]).

Furthermore, learned optimisation in the real world seems to be more "local"/task-specific (i.e. I make plans to achieve local, particular objectives [e.g. planning a trip from London to Edinburgh]; I have no global objective that I am consistently optimising for over the duration of my lifetime).

I think this is basically true for any feasible real world intelligent system[2]. So learned optimisation in the real world is: 

  1. Partial[3]
  2. Local
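To make "partial" and "local" concrete, here is a toy sketch (not a model of any real system). Everything in it is a hypothetical illustration: the `ROUTES` graph, the `HEURISTICS` table, and the `plan_route` and `act` functions are names I made up. The agent answers most situations with a cached heuristic and only runs explicit search when handed a specific task, and the search objective (fewest hops to a destination) exists only for the duration of that task.

```python
from collections import deque

# Hypothetical toy world: the cities and edges are illustrative only.
ROUTES = {
    "London": ["Birmingham", "Cambridge"],
    "Birmingham": ["Manchester"],
    "Cambridge": ["York"],
    "Manchester": ["Edinburgh"],
    "York": ["Edinburgh"],
    "Edinburgh": [],
}

# Learned heuristics: cached situation -> response pairs, no search involved.
HEURISTICS = {
    "greeting": "say hello",
    "red light": "stop",
}


def plan_route(start, goal):
    """Breadth-first search for a fewest-hops path.

    This is the only place where explicit optimisation happens, and the
    objective (reach `goal`) is local to this one call.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in ROUTES.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no route exists


def act(situation):
    """Return (mode, output): a heuristic most of the time, search only for travel tasks."""
    if situation in HEURISTICS:
        return ("heuristic", HEURISTICS[situation])
    if situation.startswith("travel:"):
        start, goal = situation[len("travel:"):].split("->")
        return ("search", plan_route(start, goal))
    return ("heuristic", "do nothing")  # default fallback, also not optimised


print(act("greeting"))
print(act("travel:London->Edinburgh"))
```

Nothing in `act` scores outputs against a lifetime-wide objective; the only objective anywhere in the program is the destination passed to `plan_route`, which is discarded once the call returns.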

Do these nuances of real world mesa-optimisers change the nature of risks from learned optimisation?

Cc: @evhub, @beren, @TurnTrout, @Quintin Pope.

  1. ^

    Though optimisation (e.g. planning) might sometimes be employed to figure out which heuristic to deploy at a particular time.

  2. ^

    This is for roughly the same reasons that I think fixed, immutable terminal goals are anti-natural; see e.g.: "Is "Strong Coherence" Anti-Natural?"

    Alternatively, I believe that real world systems learn contextual heuristics (downstream of historical selection) that influence decision making ("values") and not fixed/immutable terminal "goals". See also: "why assume AGIs will optimize for fixed goals?"

  3. ^

    This seems equivalent to Beren's concept of "hybrid optimisation"; I mostly use "partial optimisation" because it feels closer to the ontology of the Risks From Learned Optimisation paper. Given how that paper defines optimisation, I think learned algorithms operating in the real world just will not be consistently optimising for any global objective.