Jobst Heitzig

Senior Researcher / Lead, FutureLab on Game Theory and Networks of Interacting Agents @ Potsdam Institute for Climate Impact Research. 

I'm a mathematician working on collective decision making, game theory, formal ethics, international coalition formation, and a lot of stuff related to climate change. Here's my professional profile.


Comments

replacing the SGD with something that takes the shortest and not the steepest path

Maybe we can design a local search strategy similar to gradient descent which does try to stay close to the initial point x0? E.g., if at x, take a small step in the direction that has the minimal scalar product with x − x0 among all directions that make an angle of at most alpha with the current gradient, where alpha > 0 is a hyperparameter. One might call this "stochastic cone descent" if it does not yet have a name.
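Here is a minimal sketch of what one such step could look like, assuming the cone is taken around the negative gradient (the descent direction); the quadratic objective, step size, and alpha value are placeholder choices of mine, not part of the proposal:

```python
# A minimal sketch of the "stochastic cone descent" idea: step in the unit
# direction d minimizing <d, x - x0> subject to angle(d, -grad) <= alpha.
import numpy as np

def cone_descent_step(x, x0, grad, alpha, step_size):
    u = -grad / np.linalg.norm(grad)        # descent direction (cone axis)
    w = x - x0                              # drift away from the start point
    if np.linalg.norm(w) < 1e-12:
        return x + step_size * u            # at x0: plain descent step
    d_free = -w / np.linalg.norm(w)         # unconstrained minimizer of <d, w>
    if np.dot(d_free, u) >= np.cos(alpha):  # already inside the cone?
        return x + step_size * d_free
    # Otherwise the optimum lies on the cone boundary, in the plane spanned
    # by u and w: d = cos(alpha) * u + sin(alpha) * t with t a unit vector
    # orthogonal to u, tilted back toward x0.
    w_perp = w - np.dot(w, u) * u
    if np.linalg.norm(w_perp) < 1e-12:      # w parallel to u: degenerate case
        return x + step_size * u
    t = -w_perp / np.linalg.norm(w_perp)
    d = np.cos(alpha) * u + np.sin(alpha) * t
    return x + step_size * d

# Toy usage: minimize a quadratic while staying near the start point.
f_grad = lambda x: 2 * (x - np.array([3.0, 1.0]))
x0 = np.zeros(2)
x = x0.copy()
for _ in range(200):
    x = cone_descent_step(x, x0, f_grad(x), alpha=0.3, step_size=0.05)
print(x, np.linalg.norm(x - x0))
```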

Does the one-shot AI necessarily aim to maximize some function (like the probability of saving the world, or the expected "savedness" of the world, or whatever), or can we also imagine a satisficing version of the one-shot AI which "just tries to save the world" with a decent probability and doesn't aim to do any more than that, i.e., does not try to maximize that probability, the quality of the saved world, etc.?

I'm asking this because

  • I suspect that we might otherwise still make a mistake in specifying the optimization target and incentivize the one-shot AI to do something that "optimally" saves the world in some way we did not foresee and don't like.
  • I'm trying to figure out whether your plan would be hindered by switching from an optimization paradigm to a satisficing paradigm (as sketched below) right now, in order to buy time for your plan to be put into practice :-)
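To make the distinction concrete, here is a toy sketch of the two selection rules over a finite set of plans; the plan names, success probabilities, and threshold are all made up for illustration:

```python
# Illustrative contrast between an optimizing and a satisficing selection rule.
import random

success_prob = {"plan_a": 0.90, "plan_b": 0.95, "plan_c": 0.99}

def maximizer(plans):
    # Always picks the plan with the highest success probability.
    return max(plans, key=plans.get)

def satisficer(plans, threshold=0.9):
    # Picks *any* plan that is good enough, with no pressure to exceed that.
    acceptable = [p for p, prob in plans.items() if prob >= threshold]
    return random.choice(acceptable) if acceptable else None

print(maximizer(success_prob))   # always 'plan_c'
print(satisficer(success_prob))  # any of the three, uniformly at random
```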

Definition 4: Expectation w.r.t. a Set of Sa-Measures

This definition is obviously motivated by the plan to later apply some version of the maximin rule, so that only the inf matters.

I suggest that we also study versions that employ other decision-under-ambiguity rules, such as Hurwicz's rule or Savage's minimax regret rule.
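For concreteness, here is how the three rules can disagree on a toy payoff matrix whose rows are actions and whose columns are ambiguous states; the numbers and the Hurwicz optimism weight are arbitrary choices of mine:

```python
# Toy comparison of decision-under-ambiguity rules; payoffs are made up.
import numpy as np

payoffs = np.array([[4.0, 4.0, 1.0],
                    [6.0, 2.0, 2.0],
                    [9.0, 0.0, 1.0]])

# Maximin: maximize the worst-case payoff (only the inf matters).
maximin_choice = np.argmax(payoffs.min(axis=1))

# Hurwicz: maximize a fixed mix of best and worst case (optimism weight lam).
lam = 0.5
hurwicz_choice = np.argmax(lam * payoffs.max(axis=1)
                           + (1 - lam) * payoffs.min(axis=1))

# Savage minimax regret: minimize the worst-case shortfall from the
# best achievable payoff in each state.
regret = payoffs.max(axis=0) - payoffs
regret_choice = np.argmin(regret.max(axis=1))

print(maximin_choice, hurwicz_choice, regret_choice)  # 1, 2, 1
```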

From my reading of quantilizers, they might still choose "near-optimal" actions, just only with a small probability. By contrast, a system based on decision transformers (possibly combined with an LLM) could be designed so that we could simply tell it to "make me tea of this quantity and quality, within this time, with this probability", and it would attempt to do just that, without trying to make more or better tea, or to make it faster or with higher probability.
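To illustrate the contrast as I read it (with made-up utilities and parameters, not anyone's actual proposal): a q-quantilizer still puts all its probability on the top q fraction of actions, while a return-conditioned chooser in the decision-transformer spirit can aim at a specified performance level and no higher:

```python
# Toy contrast: a q-quantilizer vs. a target-conditioned chooser.
import random

actions = list(range(100))
utility = lambda a: a / 99.0     # action 99 is "optimal"

def quantilize(actions, utility, q=0.1):
    # Sample uniformly from the top q fraction (relative to a uniform base
    # distribution), so near-optimal actions are still chosen, but each
    # only with small probability.
    ranked = sorted(actions, key=utility, reverse=True)
    return random.choice(ranked[: max(1, int(q * len(ranked)))])

def target_conditioned(actions, utility, target=0.7):
    # Aim at the requested performance level, not beyond it, in the spirit
    # of conditioning a decision transformer on a desired return.
    return min(actions, key=lambda a: abs(utility(a) - target))

print(quantilize(actions, utility))          # something in the top 10%
print(target_conditioned(actions, utility))  # action 69, close to target 0.7
```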