All of Drake Thomas's Comments + Replies

An example of the sort of strengthening I wouldn't be surprised to see is something like "If V is not too badly behaved in the following ways, and for all v we have [some light-tailedness condition] on the conditional distribution X | V = v, then catastrophic Goodhart doesn't happen." This seems relaxed enough that you could actually encounter it in practice.

2 Thomas Kwa 23d
Suppose that we are selecting for U = X + V, where V is true utility and X is error. If our estimator is unbiased (E[X | V = v] = 0 for all v) and X is light-tailed conditional on any value of V, do we have lim_{t→∞} E[V | X + V ≥ t] = ∞? No; here is a counterexample. Suppose that V ∼ N(0, 1), and X | V ∼ N(0, 4) when V ∈ [−1, 1], otherwise X = 0. Then I think lim_{t→∞} E[V | X + V ≥ t] = 0. This is worrying because in the case where V ∼ N(0, 1) and X ∼ N(0, 4) independently, we do get infinite V. Merely making the error *smaller* for large values of V causes catastrophe. This suggests that success caused by light-tailed error when V has even lighter tails than X is fragile, and that these successes are “for the wrong reason”: they require the value to be overestimated just as much when V is high as when V is low.
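A rough way to sanity-check the comparison above is a small Monte Carlo simulation. The sketch below is only illustrative: the function names are hypothetical, N(0, 4) is assumed to mean variance 4 (standard deviation 2), and a finite-threshold estimate cannot establish the exact value of a limit. It only shows whether E[V | X + V ≥ t] appears to grow with t in each setup.

```python
import random


def conditional_mean_v(sample_pair, threshold, n_samples=400_000, seed=0):
    """Monte Carlo estimate of E[V | X + V >= threshold].

    sample_pair(rng) must return one (v, x) draw. Returns NaN if no
    sample clears the threshold.
    """
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n_samples):
        v, x = sample_pair(rng)
        if x + v >= threshold:
            total += v
            count += 1
    return total / count if count else float("nan")


def independent_error(rng):
    """V ~ N(0, 1) and X ~ N(0, 4), drawn independently."""
    v = rng.gauss(0.0, 1.0)
    x = rng.gauss(0.0, 2.0)  # assumption: N(0, 4) means variance 4, so sd = 2
    return v, x


def shrinking_error(rng):
    """Counterexample: X ~ N(0, 4) only when V is in [-1, 1], else X = 0."""
    v = rng.gauss(0.0, 1.0)
    x = rng.gauss(0.0, 2.0) if -1.0 <= v <= 1.0 else 0.0
    return v, x


if __name__ == "__main__":
    for t in (3.0, 5.0):
        ind = conditional_mean_v(independent_error, t)
        cex = conditional_mean_v(shrinking_error, t)
        print(f"t={t}: independent {ind:.3f}, counterexample {cex:.3f}")
```

In the independent case the estimate should rise as t increases. In the counterexample, nearly every selected sample has V in [−1, 1], because a sample with V outside that range has X = 0 and clears the threshold only if V itself is at least t. The estimate therefore stays bounded however large t gets.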

I'm not sure what you mean formally by these assumptions, but I don't think we're making all of them. Certainly we aren't assuming things are normally distributed - the post is in large part about how things change when we stop assuming normality! I also don't think we're making any assumptions with respect to additivity; U = X + V is more of a notational or definitional choice, though as we've noted in the post it's a framing that one could think doesn't carve reality at the joints. (Perhaps you meant something different by additivity, though - feel... (read more)

2 rotatingpaguro 7mo
I wasn't saying you made all those assumptions; I was trying to imagine an empirical scenario that would yield your assumptions, and the first thing to come to my mind produced even stricter ones. I do realize now that I messed up my comment: there should not be Normality there, just additivity and independence, in the sense of U − V ⊥ V. Sorry. I do agree you could probably obtain similar-looking results with relaxed versions of the assumptions. However, U − V ⊥ V seems quite specific to me, and you would need to make a convincing case that this is what you get in some realistic cases to make your theorem look useful. I expect the same will apply to whatever relaxed condition you can find that allows you to make a theorem. Example: if you said "I made a version of the theorem assuming there exists f such that f(U, V) ⊥ V for f in some class of functions", I'd still ask "and in what realistic situations does such a setup arise, and why?"