AI ALIGNMENT FORUM
Wireheading
• Applied to Note on algorithms with multiple trained components by Steve Byrnes 5mo ago
• Applied to generalized wireheading by Raymond Arnold 7mo ago
• Applied to Four usages of "loss" in AI by Alex Turner 8mo ago
• Applied to Towards deconfusing wireheading and reward maximization by leogao 8mo ago
• Applied to Artificial intelligence wireheading by Big Tony 10mo ago
• Applied to Reward is not the optimization target by Alex Turner 10mo ago
• Applied to Reinforcement Learner Wireheading by Nate Showell 1y ago
• Applied to Value extrapolation vs Wireheading by Ruben Bloom 1y ago
• Applied to [Intro to brain-like-AGI safety] 10. The alignment problem by Steve Byrnes 1y ago
• Applied to [Intro to brain-like-AGI safety] 9. Takeaways from neuro 2/2: On AGI motivation by Steve Byrnes 1y ago
• Applied to Model-based RL, Desires, Brains, Wireheading by Steve Byrnes 2y ago
Yoav Ravid, v1.6.0, Jun 3rd 2021 (+14/-197): Fixed links in related pages section, and removed notable posts section
Related pages:
Related:
Complexity of Value, Goodhart's Law, Inner Alignment
Notable Posts
Are wireheads happy?
Would Your Real Preferences Please Stand Up?
You cannot be mistaken about (not) wanting to wirehead
Wireheading Done Right: Stay Positive Without Going Insane
• Applied to Would Your Real Preferences Please Stand Up? by Yoav Ravid 2y ago
• Applied to Wireheading Done Right: Stay Positive Without Going Insane by eFish 2y ago
eFish, v1.5.0, Apr 25th 2021 (+133): add links
Are wireheads happy?
Would Your Real Preferences Please Stand Up?
You cannot be mistaken about (not) wanting to wirehead
Wireheading Done Right: Stay Positive Without Going Insane
External links
Wirehead Hedonism versus paradise engineering by David Pearce
• Applied to Safely controlling the AGI agent reward function by Koen Holtman 2y ago
• Applied to Disentangling Corrigibility: 2015-2021 by Koen Holtman 2y ago
• Applied to Hedonic asymmetries by MichaelA 3y ago
• Applied to Defining AI wireheading by Mark Xu 3y ago
• Applied to Draft papers for REALab and Decoupled Approval on tampering by Adam Shimi 3y ago