Alex_Altair — AI Alignment Forum

Towards a formalization of the agent structure problem

In "Clarifying the Agent-Like Structure Problem" (2022), John Wentworth describes a hypothetical instance of what he calls a selection theorem. In Scott Garrabrant's words, the question is: does agent-like behavior imply agent-like architecture? That is, if we take some class of behaving things and apply a filter for agent-like behavior,...

Apr 29, 2024 · 56

Why don't quantilizers also cut off the upper end of the distribution?

It seems to me that the main goal of quantilization is to reduce the extreme unintended outcomes of maximizing (by sampling from something like a human-learned distribution over actions) while still remaining competitive (by sampling from only the upper quantile of said distribution). But that still leaves open the possibility...

May 15, 2023 · 25

Alex_Altair's Shortform

Nov 27, 2022 · 7