Generallyer — AI Alignment Forum

Multiscale agency, self-misalignment, and ecological basins of attraction? This sounds really excellent and targets a lot of the conceptual holes I worry about in existing approaches. I look forward to the work that comes out of this!!

I was reminded of a couple of resources you may or may not already be aware of.

For 'vertical' game theory, check out Jules Hedges' work on open/compositional games. https://arxiv.org/search/cs?searchtype=author&query=Hedges%2C+J

For aggregative alignment, there's an interesting literature on the topology of social choice, which treats results like Arrow's impossibility theorem as descriptions of holes in the space of preferences. https://t.co/8HEpSu0SoE There's something cool going on where partially overlapping, locally linear rankings can have much stranger global structures. I'm also reminded of this comment on the possible virtues of self-misalignment. https://www.lesswrong.com/posts/Di4bFP7kjoLEQLpQd/what-s-the-relationship-between-human-values-and-the-brain-s?commentId=zDt5auxfDAhcHktGm&s=09
