Thanks for clarifying. I misunderstood your post and must have forgotten about the scope; sorry about that. I'll remove that paragraph. Thanks for the links too: I hadn't read those, and I appreciate the pseudocode.
Most likely I still don't understand what you mean by grader-optimizer, but it's probably better to discuss that on your post once I've spent more time going over your posts and comments.
My current guess in my own words is: A grader-optimizer is something that approximates argmax (has high optimization power)? And option (1) acts a bit like a soft optimizer, but with more specific structure related to shards, and how it works out whether to continue optimizing?
Thanks for registering a guess! I would put it as: a grader-optimizer is
something which is trying to optimize the outputs of a grader as its terminal
end (either de facto, via argmax, or via intent alignment, as in "I wanna search
for plans which make this function output a high number"). Like, the point of
the optimization is to make the number come out high.
(To help you checksum: It feels important to me that "is good at achieving its
goals" is not tightly coupled to "approximating argmax", as I'm talking about
those terms. I wish I had fast ways of communicating my intuitions here, but I'm
not thinking of something more helpful to say right now; I figured I'd at least
comment what I've already written.)
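(A rough sketch of just the "de facto, via argmax" reading, in case it helps pin down the term. Everything here is hypothetical and for illustration only: the name `grader_optimize`, the toy plans, and the lookup-table grader are my own stand-ins, not anything from the post.)

```python
from typing import Callable, Iterable, TypeVar

Plan = TypeVar("Plan")


def grader_optimize(plans: Iterable[Plan], grader: Callable[[Plan], float]) -> Plan:
    """Return the plan with the highest grader output (plain argmax).

    The only thing being selected for is the grader's number coming out high.
    """
    return max(plans, key=grader)


# Hypothetical toy grader: a fixed table of scores over three made-up plans.
toy_scores = {"plan_a": 0.2, "plan_b": 0.9, "plan_c": 0.5}


def toy_grader(plan: str) -> float:
    return toy_scores[plan]


best = grader_optimize(toy_scores.keys(), toy_grader)
```

This only covers the argmax case; the intent-alignment case is about what the search is for, which a few lines of code like this don't capture.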