AI ALIGNMENT FORUM

Subagents

Edited by Kaj_Sotala, last updated 29th Jul 2020

Subagents refers to the idea that, rather than thinking of the mind as a single entity with one set of goals and beliefs, we can model it as containing many independently acting components, each of which may have its own goals and beliefs. One intuitive way of expressing this is the phrase "one part of me wants X, but another part of me wants Y instead".

While the name implies some degree of independent agency on the part of the subagents, they may also be viewed as more passive entities. For example, the "parts" in the above example may be considered different sets of beliefs, accessed one at a time by the same system.

The Multiagent Models of Mind sequence explores the notion of subagents in detail. Akrasia (acting against one's better judgment, such as by procrastinating) may involve subagent disagreement. Internal Double Crux is one technique for getting subagents to agree with each other.
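
As a concrete illustration, here is a minimal toy sketch in Python (not part of the original tag description; the subagent names and options are hypothetical) of a mind modeled as a collection of parts with differing goals:

```python
# Toy sketch only: a "mind" as a collection of subagents, each with its
# own goals (utilities over options), mirroring the "one part of me wants
# X, but another part of me wants Y" framing above.

from dataclasses import dataclass

@dataclass
class Subagent:
    name: str
    utilities: dict[str, float]  # option -> how much this part wants it

    def preferred(self) -> str:
        # The option this part would choose on its own.
        return max(self.utilities, key=self.utilities.get)

mind = [
    Subagent("long-term planner", {"write report": 1.0, "browse forum": 0.1}),
    Subagent("novelty seeker",    {"write report": 0.2, "browse forum": 0.9}),
]

# Disagreement between parts shows up as conflicting preferences
# (one way akrasia has been framed in this literature):
for part in mind:
    print(f"{part.name} prefers: {part.preferred()}")

# On the more "passive" reading, the parts do not act independently;
# the same system simply acts on whichever part's goals are currently
# active:
active = mind[1]
print(f"Currently acting on: {active.preferred()}")
```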

Posts tagged Subagents
| Score | Title | Author(s) | Posted | Comments |
|---|---|---|---|---|
| 47 | Why Subagents? | johnswentworth | 6y | 12 |
| 17 | Multi-agent predictive minds and AI alignment | Jan_Kulveit | 7y | 0 |
| 17 | Quick thoughts on the implications of multi-agent views of mind on AI takeover | Kaj_Sotala | 2y | 1 |
| 54 | Embedded Agency (full-text version) | Scott Garrabrant, abramdemski | 7y | 4 |
| 45 | Shard Theory: An Overview | David Udell | 3y | 2 |
| 49 | The self-unalignment problem | Jan_Kulveit, rosehadshar | 2y | 8 |
| 43 | Reward Is Not Enough | Steven Byrnes | 4y | 12 |
| 44 | Hierarchical Agency: A Missing Piece in AI Alignment | Jan_Kulveit | 9mo | 2 |
| 31 | Announcing the Alignment of Complex Systems Research Group | Jan_Kulveit, technicalities | 3y | 11 |
| 18 | Game Theory without Argmax [Part 1] | Cleo Nardo | 2y | 1 |
| 22 | Subagents of Cartesian Frames | Scott Garrabrant | 5y | 4 |
| 18 | Embedded Agency via Abstraction | johnswentworth | 6y | 16 |
| 16 | Wildfire of strategicness | TsviBT | 2y | 15 |
| 17 | Eight Definitions of Observability | Scott Garrabrant | 5y | 26 |
| 15 | Committing, Assuming, Externalizing, and Internalizing | Scott Garrabrant | 5y | 25 |