With thanks to Rohin Shah.

Dear LessWrongers, this is an opportunity to make money and help with AI alignment.

We're looking for references on a specific AI capability; has anyone published on the following subject:

  • Generating multiple reward functions or policies from the same set of challenges. Have there been designs, for deep learning or similar, in which the agent produces multiple independent reward functions (or policies) to explain the same reward signal or behaviour?

For example, in CoinRun, the agent must get to the end of the level, on the right, to collect the coin. It only gets the reward for collecting the coin.

That is the "true" reward. But since the coin is always at the far right, as far as the agent knows, "go to the far right of the level" could just as well have been the true reward.

We'd want some design that generated both these reward functions (and, in general, generated multiple reward functions whenever there are several independent candidates). Alternatively, it might generate two independent policies; we could test these by putting the coin in the middle of the level and seeing what each policy decided to do.
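To make that test concrete, here's a rough sketch of what we have in mind, assuming a gym-style CoinRun variant that lets us place the coin; the `collected_coin` and `reached_right_wall` info keys are made up for illustration and stand in for whatever the modified environment actually reports:

```python
def compare_policies(policies, env, n_episodes=100):
    """Roll each candidate policy out on levels where the coin has been moved
    to the middle, and count how often it grabs the coin vs. runs to the far
    right wall. The two info keys below are placeholders for whatever the
    modified environment actually reports."""
    results = {}
    for name, policy in policies.items():
        got_coin, went_right = 0, 0
        for _ in range(n_episodes):
            obs = env.reset()
            done, info = False, {}
            while not done:
                obs, reward, done, info = env.step(policy.act(obs))
            got_coin += bool(info.get("collected_coin"))
            went_right += bool(info.get("reached_right_wall"))
        results[name] = (got_coin / n_episodes, went_right / n_episodes)
    return results
```

If one policy mostly ends up at the coin and the other mostly ends up at the right wall, we'd count that as the design having successfully separated the two candidate rewards.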

We're not interested in a Bayesian approach that lists a bunch of reward functions and then updates to include just those two (that's trivially easy to do). Nor are we interested in an IRL-style approach that lists "features", including the coin and the right-hand side of the level.

What we'd want is some neural-net-style design that generates the coin reward and the move-right reward just from the game data, without any prior knowledge of the setting.
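To illustrate the kind of design we mean (not a method we know to exist, hence this post), here's a minimal PyTorch sketch of one possible shape: a shared encoder with several reward heads, all trained to fit the rewards observed in the training levels, plus a penalty that pushes the heads to disagree on states the training data doesn't pin down (e.g. frames where the coin isn't at the far right). The architecture and the diversity term are our own assumptions, purely for illustration:

```python
import torch
import torch.nn as nn

class RewardEnsemble(nn.Module):
    """A shared CNN encoder feeding several independent reward heads."""
    def __init__(self, n_heads: int = 2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.heads = nn.ModuleList([nn.LazyLinear(1) for _ in range(n_heads)])

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        z = self.encoder(obs)
        return torch.cat([head(z) for head in self.heads], dim=-1)  # (batch, n_heads)

def ensemble_loss(model, obs, observed_reward, novel_obs, diversity_weight=0.1):
    # Every head has to explain the rewards actually seen during training...
    fit = ((model(obs) - observed_reward.unsqueeze(-1)) ** 2).mean()
    # ...but the heads are encouraged to disagree on out-of-distribution frames,
    # so "collect the coin" and "go far right" can come apart as separate hypotheses.
    disagreement = model(novel_obs).var(dim=-1).mean()
    return fit - diversity_weight * disagreement
```

Whether the disagreement term should act on random rollouts, adversarially chosen states, or held-out levels is exactly the sort of design detail we'd hope a published reference pins down.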

So, does anyone know any references for that kind of work?

We will pay $50 for the first relevant reference submitted, and $100 for the best reference.

Thanks!

Comments

What we'd want is some neural-net-style design that generates the coin reward and the move-right reward just from the game data, without any prior knowledge of the setting.

So you're looking for curriculum design/exploration in meta-reinforcement-learning? Something like Enhanced POET/PLR/REPAIRED, but where it's not just moving right but a complicated environment with arbitrary reward functions (e.g. using randomly initialized CNNs to map state to 'reward')? Or would hindsight or successor methods count, as they relabel rewards for executed trajectories? Would relatively complex generative games like Alchemy or LIGHT count? Self-play, like robotics self-play?

Hey there! Sorry for the delay. $50 awarded to you for the fastest good reference. PM me your bank details.