This is a wonderful piece, and it's great to hear from somebody so deeply knowledgeable in the field. I wonder if approval reward might be an emergent property. When you're operating under radical uncertainty, with more variables than you can possibly model, defaulting to "what would my community approve of?" is a computationally efficient heuristic. It's not some holy pro-social module that evolution baked into us; it's a deeply rational response to chaos. Even when I tip at a restaurant I will never return to, with no one watching, I know psychologically that I am reinforcing behavior that will serve me well in the future. Who's to say what will happen that I can't possibly predict? Maybe later in the day I will be mugged, and that waiter will save my life. By aligning with the collective good unless you absolutely have to deviate, you create conditions under which you're statistically more likely to survive. If this is true, then a sufficiently intelligent agent facing genuine uncertainty might converge on something like Approval Reward not because it was programmed in, but because it's rational.