In this case humans are doing the job of transferring from D to D∗, and the training algorithm just has to generalize from a representative sample of D∗ to the test set.
Thanks for the references! I now know that I'm interested specifically in cooperative game theory, and I see that Shoham & Leyton-Brown has a chapter on "coalitional game theory", so I'll take a look.
A proof of the lemma V(a,b∗)≤v∗:
Ah, ok. When you said "obedience" I imagined too little agency — an agent that wouldn't stop to ask clarifying questions. But I think we're on the same page regarding the flavor of the objective.
Might not intent alignment (doing what a human wants it to do, being helpful) be a better target than obedience (doing what a human told it to do)?
My takeaway from this is that if we're doing policy selection in an environment that contains predictors, instead of applying the counterfactual belief that the predictor is always right, we can assume that we get rewarded if the predictor is wrong, and then take maximin.
How would you handle Agent Simulates Predictor? Is that what TRL is for?
The observation can provide all sorts of information about the universe, including whether exploration occurs. The exact set of possible observations depends on the decision problem.
E and O can have any relationship, but the most interesting case is when one can infer E from O with certainty.
Thanks, I made this change to the post.
Yeah, I think the fact that Elo only models the macrostate makes this an imperfect analogy. I think a better analogy would involve a hybrid model, which assigns a probability to a chess game based on whether each move is plausible (using a policy network), and whether the higher-rated player won.
I don't think the distinction between near-exact and nonexact models is essential here. I bet we could introduce extra entropy into the short-term gas model and the rollout would still be superior for predicting the microstate than the Boltzmann distribution.
The sum isn't over i, though, it's over all possible tuples of length n−1. Any ideas for how to make that more clear?