AI Alignment Forum: Tags

Reward Functions

Posts tagged Reward Functions
Sorted by: most relevant

Title | Authors | Age | Karma | Comments | Relevance
Reward is not the optimization target | Alex Turner | 8mo | 85 | 76 | 3
Draft papers for REALab and Decoupled Approval on tampering | Jonathan Uesato, Ramana Kumar | 2y | 28 | 2 | 3
Scaling Laws for Reward Model Overoptimization | leogao, John Schulman, Jacob Hilton | 5mo | 45 | 5 | 0
Seriously, what goes wrong with "reward the agent when it makes you smile"? (question post) | Alex Turner, johnswentworth | 7mo | 37 | 12 | 2
Four usages of "loss" in AI | Alex Turner | 6mo | 20 | 14 | 2
$100/$50 rewards for good references | Stuart Armstrong | 1y | 11 | 2 | 1
A Short Dialogue on the Meaning of Reward Functions | Leon Lang, Quintin Pope, peligrietzer | 4mo | 26 | 0 | 1
Thoughts on reward engineering | Paul Christiano | 4y | 10 | 19 | 1
The reward engineering problem | Paul Christiano | 4y | 6 | 1 | 1
Reward model hacking as a challenge for reward learning | Erik Jenner | 1y | 0 | 0 | 0
Reward functions and updating assumptions can hide a multitude of sins | Stuart Armstrong | 3y | 8 | 2 | 1
Probabilities, weights, sums: pretty much the same for reward functions | Stuart Armstrong | 3y | 6 | 0 | 1