(This proposal received an honorable mention in the ELK prize results, and we believe it was classified among the strategies which “reward reporters that are sensitive to what’s actually happening in the world”. However, we do not think that the counterexample to that class of strategies works against our proposal, and we explain why in a note at the end. Feedback, disagreement, and new failure modes are very welcome!)
Basic idea
A Human Simulator only cares about the observations that the human sees and how the human interprets those observations, not the predictor’s understanding of the vault. The Truthful Reporter, by contrast, cares about the predictor’s understanding of the vault, accessed via the posterior distribution...