Lukas Berglund

Comments

How does shard theory explain romantic jealousy? It seems like most people feel jealous when their romantic partner does things like dancing with someone else or laughing at someone else's jokes. How would shards like this form from simple reward circuitry? I'm having trouble coming up with a good story of how this happens. I would appreciate it if someone could sketch one out for me.

2. There’s lots of Minecraft videos on YouTube, so you could test a “GPT-3 for Minecraft” approach.

OpenAI just did this exact thing.

I see, thanks for answering. To further clarify: given that the reporter's only access to the human's nodes is through the human's answers, would it be equally likely for the reporter to learn a mapping to some other Bayes net that is similarly consistent with the answers provided? Is there a reason why the reporter would map to the human's Bayes net in particular?

Potentially silly question: 

In the first counterexample, you describe the desired behavior as:

Intuitively, we expect each node in the human Bayes net to correspond to a function of the predictor’s Bayes net. We’d want the reporter to simply apply the relevant functions from subsets of nodes in the predictor's Bayes net to each node in the human Bayes net [...]

After applying these functions, the reporter can answer questions using whatever subset of nodes the human would have used to answer that question.

Why doesn't the reporter skip the step of mapping the predictor's Bayes net to the human's and instead just answer the question using its own nodes? What's the benefit of having the intermediate step that maps the predictor's net to the human's?
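
To make the two options in this question concrete, here is a toy Python sketch. Everything in it is an assumption made for illustration: the node names, the dictionary representation of a "Bayes net", and the functions `translate`, `answer_via_human_nodes`, and `answer_directly` are hypothetical and are not taken from the report. It only shows the shape of "map predictor nodes to human nodes, then answer" versus "answer straight from predictor nodes".

```python
# Toy illustration only. All node names and functions are hypothetical.

# The predictor's nodes, represented here as a plain dict of values.
predictor_nodes = {
    "pixel_blob_at_pedestal": 1,
    "door_sensor_tripped": 0,
    "camera_feed_tampered": 0,
}

# Assumption: each human node is a function of a subset of predictor nodes.
human_node_functions = {
    "diamond_in_vault": lambda p: bool(p["pixel_blob_at_pedestal"])
    and not p["camera_feed_tampered"],
    "robber_entered": lambda p: bool(p["door_sensor_tripped"]),
}


def translate(predictor):
    """Intermediate step: compute every human node from the predictor's nodes."""
    return {name: fn(predictor) for name, fn in human_node_functions.items()}


def answer_via_human_nodes(question_node, predictor):
    """Route 1: translate into the human's nodes first, then read off the answer."""
    human_nodes = translate(predictor)
    return human_nodes[question_node]


def answer_directly(question_node, predictor):
    """Route 2: skip the full translation and compute only the needed value."""
    return human_node_functions[question_node](predictor)


if __name__ == "__main__":
    for node in human_node_functions:
        print(node, answer_via_human_nodes(node, predictor_nodes),
              answer_directly(node, predictor_nodes))
```

In this toy, the two routes agree by construction, because the direct route reuses the same functions. The sketch does not settle whether they would come apart in the real setting, which is what the question is asking.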