You are viewing revision 1.4.0, last edited by Yoav Ravid

Mesa-optimization is the situation that occurs when a learned model (such as a neural network) is itself an optimizer. The base optimizer is the process that searches for and produces the model; when the model it produces is also an optimizer, that model is called a mesa-optimizer. Earlier work on this concept used the terms Inner Optimizer and Optimization Daemons.

Examples

Natural selection is an optimization process that optimizes for reproductive fitness. It produced humans, who are themselves optimizers but pursue goals that no longer correlate reliably with reproductive fitness. In this case, humans are mesa-optimizers (optimization daemons, in the older terminology) of natural selection. In the context of AI alignment, the concern is that an artificial general intelligence exerting optimization pressure may likewise produce mesa-optimizers whose goals diverge from its own, breaking alignment.[1]

History

Work on this concept was previously described using the terms Inner Optimizer and Optimization Daemons...
