This post was inspired by orthonormal's post Developmental Stages of GPTs and the discussion that followed, so only part of it is original.
First I'll aim to provide a crisper version of the argument for why GPT wants to mesa-optimize. Specifically, I'll explain a well-known optimization algorithm used in text generation, and argue that GPT can improve performance on its objective by learning to implement something like this algorithm internally.
Then I'll offer some ideas of mine about how we might change this.
Explanation of beam search
Our goal is to generate plausible text. We evaluate whether text is "plausible" by multiplying together all the individual word probabilities from our language model.
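To make this concrete, here is a minimal sketch of that plausibility score. The `log_prob_fn` is a made-up stand-in for a real language model's conditional probabilities; in practice you sum log-probabilities rather than multiplying raw probabilities, to avoid numerical underflow.

```python
import math

def plausibility(log_prob_fn, words):
    # Multiply per-word probabilities by summing their logs
    # (numerically stable), then exponentiate back.
    total = sum(log_prob_fn(words[:i], w) for i, w in enumerate(words))
    return math.exp(total)

# Toy stand-in for a language model: every word gets probability 0.5.
toy_lm = lambda prefix, word: math.log(0.5)

print(plausibility(toy_lm, ["cheddar", "is", "orange"]))  # ≈ 0.125 (0.5 ** 3)
```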
Greedy word selection has a problem: Since it doesn't do lookahead, it's liable to get stuck in a dead end. Let's say we give our system the following poem about cheeses and ask it to generate more text:
> Mozzarella is white
> So you can see it at night
> Cheddar is...
If our language model is decent, the word it will assign the highest probability to is "orange". But this creates a problem, because "orange" is a hard word to rhyme.
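The dead-end problem can be illustrated with a toy model (the probability tables below are made up for illustration, not from any real language model). Greedy selection commits to the locally best word and never reconsiders:

```python
# Toy next-word distributions (hypothetical, for illustration only).
PROBS = {
    (): {"nice": 0.5, "dog": 0.4},
    ("nice",): {"woman": 0.4, "house": 0.3},
    ("dog",): {"has": 0.9, "runs": 0.1},
}

def greedy_decode(start, steps):
    words = list(start)
    for _ in range(steps):
        table = PROBS.get(tuple(words), {})
        if not table:
            break
        # Always grab the single most probable next word -- no lookahead.
        words.append(max(table, key=table.get))
    return words

print(greedy_decode([], 2))  # ['nice', 'woman'], plausibility 0.5 * 0.4 = 0.20
```

Here greedy picks "nice" first and ends up with a completion of plausibility 0.20, even though starting with "dog" would have allowed "dog has" at plausibility 0.4 × 0.9 = 0.36.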
Beam search is an attempt to solve this problem. Instead of picking the next word greedily, we explore the tree of completions and try to find a multi-word completion that maximizes the product of the individual word probabilities.
Because the English vocabulary is so large, this tree grows exponentially with depth. So we choose an integer beam_width for the number of partial completions to track, and each time we take another step deeper into the tree, we discard all but the most plausible beam_width partial completions.
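A minimal sketch of that procedure, again using made-up probabilities rather than a real language model (a production decoder would also prune the vocabulary at each step instead of scoring every word):

```python
import math

# Toy conditional probabilities (hypothetical, for illustration only).
PROBS = {
    (): {"nice": 0.5, "dog": 0.4, "car": 0.1},
    ("nice",): {"woman": 0.4, "house": 0.3, "guy": 0.3},
    ("dog",): {"has": 0.9, "runs": 0.05, "and": 0.05},
}
VOCAB = sorted({w for table in PROBS.values() for w in table})

def toy_lm(prefix, word):
    # log P(word | prefix); unseen continuations get a tiny probability.
    return math.log(PROBS.get(tuple(prefix), {}).get(word, 1e-9))

def beam_search(log_prob_fn, vocab, prefix, beam_width, depth):
    beams = [(0.0, list(prefix))]  # (sum of log-probs, words so far)
    for _ in range(depth):
        candidates = [
            (score + log_prob_fn(words, w), words + [w])
            for score, words in beams
            for w in vocab
        ]
        # Discard all but the beam_width most plausible partial completions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    return beams

best_score, best_words = beam_search(toy_lm, VOCAB, [], beam_width=2, depth=2)[0]
print(best_words)  # ['dog', 'has'] -- greedy would have committed to 'nice'
```

With a beam width of 2, the search keeps "dog" alive even though "nice" looks better after one word, and so discovers the higher-plausibility completion "dog has" (0.36 vs. 0.20).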
Beam search with a beam width of 2. The bold red path corresponds to the maximum-plausibility completion, which would not get discovered by greedy search because "nice" has a higher probability than "dog". Image stolen from this Hugging Face blog post, which has another explanation of beam search.