AI ALIGNMENT FORUM Tags
Wireheading

• Applied to Assessment of AI safety agendas: think about the downside risk by Roman Leventov 4mo ago
• Applied to Reward Hacking from a Causal Perspective by Tom Everitt 9mo ago
• Applied to Note on algorithms with multiple trained components by Steve Byrnes 1y ago
• Applied to generalized wireheading by Raymond Arnold 1y ago
• Applied to Four usages of "loss" in AI by Alex Turner 2y ago
• Applied to Towards deconfusing wireheading and reward maximization by leogao 2y ago
• Applied to Artificial intelligence wireheading by Big Tony 2y ago
• Applied to Reward is not the optimization target by Alex Turner 2y ago
• Applied to Reinforcement Learner Wireheading by Nate Showell 2y ago
• Applied to Value extrapolation vs Wireheading by Ruben Bloom 2y ago
• Applied to [Intro to brain-like-AGI safety] 10. The alignment problem by Steve Byrnes 2y ago
• Applied to [Intro to brain-like-AGI safety] 9. Takeaways from neuro 2/2: On AGI motivation by Steve Byrnes 2y ago
• Applied to Model-based RL, Desires, Brains, Wireheading by Steve Byrnes 3y ago

Yoav Ravid v1.6.0 Jun 3rd 2021 (+14/-197) Fixed links in related pages section, and removed notable posts section

Related pages: ~~Related:~~ Complexity of Value, Goodhart's Law, Inner Alignment

Notable Posts

• Applied to Would Your Real Preferences Please Stand Up? by Yoav Ravid 3y ago
• Applied to Wireheading Done Right: Stay Positive Without Going Insane by eFish 3y ago

eFish v1.5.0 Apr 25th 2021 (+133) add links

External links

Wirehead Hedonism versus paradise engineering by David Pearce

• Applied to Safely controlling the AGI agent reward function by Koen Holtman 3y ago
• Applied to Disentangling Corrigibility: 2015-2021 by Koen Holtman 3y ago