AI ALIGNMENT FORUM

Tiling Agents

Edited by markov, Mateusz Bagiński, et al. Last updated 16th Jul 2024.

An agent might be able to create versions of itself that are similar or slightly better. These new agents can in turn create similar or better versions of themselves, and so on in a repeating pattern. This is referred to as an agent tiling itself.

This leads to the question: How can the original agent trust that these recursively generated agents maintain goals that are similar to the original agent's objective?

In a deterministic logical setting where all agents share the same axioms, "trust" means being able to formally prove that the conclusions reached by any subsequently generated agent will be true. Löb's theorem limits when this form of trust is possible: a sufficiently strong, consistent formal system cannot prove that everything it proves is true. The resulting inability to establish this trust is called the Löbian obstacle.
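
As a rough sketch of why Löb's theorem blocks this kind of trust, the standard textbook statement can be written out as follows. The notation is an assumption introduced here for illustration and does not come from this page: T is a theory extending Peano Arithmetic, and \Box_T is its standard provability predicate.

```latex
% Requires amssymb for \ulcorner and \urcorner.
% Löb's theorem for a theory T extending Peano Arithmetic
% (T and \Box_T are illustrative notation, not from the page):
\[
  \text{If } T \vdash \Box_T \ulcorner \varphi \urcorner \rightarrow \varphi,
  \text{ then } T \vdash \varphi .
\]
% Trusting a successor that also reasons in T would require T to prove
% the soundness schema for every sentence \varphi:
\[
  T \vdash \Box_T \ulcorner \varphi \urcorner \rightarrow \varphi
  \quad \text{for all } \varphi .
\]
% Taking \varphi = \bot, Löb's theorem then gives T \vdash \bot.
% So a consistent T cannot prove this schema for its own proofs.
```

The special case with \varphi = \bot is Gödel's second incompleteness theorem: a consistent T of this kind cannot prove its own consistency.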

See Also: Löbian obstacle, Löb's theorem, Vingean Agents, Vingean Reflection

References:

  • intelligence.org/files/TilingAgents.pdf
Posts tagged Tiling Agents
  • Probabilistic Tiling (Preliminary Attempt), by Diffractor (7y; karma 8, 8 comments)
  • Logical Inductor Tiling and Why it's Hard, by Diffractor (7y; karma 3, 0 comments)
  • Seeking Collaborators, by abramdemski (10mo; karma 32, 8 comments)
  • The alignment stability problem, by Seth Herd (2y; karma 16, 3 comments)
  • Paraconsistent Tiling Agents (Very Early Draft), by IAFF-User-4 (10y; karma 4, 5 comments)
  • The Pando Problem: Rethinking AI Individuality, by Jan_Kulveit (5mo; karma 42, 4 comments)
  • Working through a small tiling result, by James Payor (4mo; karma 35, 6 comments)
  • Vingean Reflection: Open Problems, by abramdemski (10y; karma 2, 2 comments)