I strongly recommend @beren's "Deconfusing Direct vs Amortised Optimisation"; it's the most important blog post I've read this year. The conceptual clarification it offers has changed how I think about many issues bearing on technical AI safety.
This sequence (if I get around to completing it) is an attempt to draw more attention to Beren's conceptual frame and its implications for how to think about issues of alignment and agency.
This first post presents a distillation of the concept; subsequent posts will explore its implications.
Beren introduces a taxonomy that categorises intelligent systems according to the kind of optimisation they perform. I think it's more helpful to treat these as two ends of a spectrum rather than as distinct, discrete categories; sophisticated real-world intelligent systems (e.g. humans) appear to be hybrids of the two approaches.
Naively, direct optimisers can be understood as computing (an approximation of) argmax (or argmin) for a suitable objective function during inference.
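To make this concrete, here is a minimal sketch of the direct pattern (the toy objective and all names are hypothetical): inference itself is an explicit search, computing argmax over candidate outputs.

```python
import itertools

def objective(plan: tuple) -> float:
    # Toy objective: prefer plans whose steps sum to a target of 10.
    return -abs(sum(plan) - 10)

def direct_optimise(candidates):
    # Inference *is* the optimisation: explicitly compute argmax over candidates.
    return max(candidates, key=objective)

candidate_plans = itertools.product(range(5), repeat=3)
best_plan = direct_optimise(candidate_plans)  # e.g. (2, 4, 4)
```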
Naively, amortised optimisers can be understood as evaluating a (fixed) learned function; they're not directly computing argmax (or argmin) for any particular objective function during inference.
Or strategies, plans, probabilities, categories, etc.; any "output" of the system.
I would add that this function is usually the solution to an objective that was itself solved by some form of direct optimisation during training. E.g. your classifier learns the map from input to label.
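To contrast with the direct case above, here is a minimal sketch of the amortised pattern (a toy logistic-regression classifier in numpy; all names are hypothetical): the direct optimisation, gradient descent on a training loss, is paid once up front, and inference afterwards is a cheap evaluation of the learned function with no search over outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # toy labels

# Training: direct optimisation (gradient descent) of the logistic loss.
w = np.zeros(2)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))   # predicted probabilities
    w -= 0.1 * X.T @ (p - y) / len(y)  # gradient step on the loss

def amortised_inference(x):
    # Inference: evaluate the fixed learned function; no argmax/argmin here.
    return 1.0 / (1.0 + np.exp(-x @ w)) > 0.5
```

Here the training loop plays the role of the direct optimiser in the footnote above: the learned map from input to label is the (approximate) solution to the training objective, and evaluating it amortises the cost of that optimisation across all future inputs.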