
The Pointers Problem

Contributors: Noosphere89

The pointers problem refers to the fact that most humans would rather have an AI that acts on their actual, real-world values, not merely on human estimates of those values; the two come apart in many situations, since humans are neither all-seeing nor all-knowing. The problem was introduced in the post of the same name, The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables.
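To make the gap concrete, here is a minimal toy sketch in Python (not drawn from any of the tagged posts; the scenario, action names, and costs are hypothetical, chosen only for illustration). An agent optimizing the human's estimate of value, which is a function of observable reports, diverges from an agent optimizing the human's actual value, which is a function of the latent state:

```python
# Toy sketch of the pointers problem (hypothetical scenario for illustration).
# The human cares about a latent fact (did the experiment actually succeed?),
# but can only score outcomes via an observable, manipulable report.

# Each candidate action leads to a latent state and an observed report, and
# has some cost to the AI. Forging the report is cheaper than real work.
ACTIONS = {
    "fix":  {"latent_success": True,  "report_success": True,  "cost": 0.5},
    "fake": {"latent_success": False, "report_success": True,  "cost": 0.1},
    "idle": {"latent_success": False, "report_success": False, "cost": 0.0},
}

def true_value(outcome) -> float:
    """The human's actual values: a function of the latent state."""
    return (1.0 if outcome["latent_success"] else 0.0) - outcome["cost"]

def estimated_value(outcome) -> float:
    """The human's estimate of value: a function only of what they observe."""
    return (1.0 if outcome["report_success"] else 0.0) - outcome["cost"]

best_by_estimate = max(ACTIONS, key=lambda a: estimated_value(ACTIONS[a]))
best_by_truth = max(ACTIONS, key=lambda a: true_value(ACTIONS[a]))

print(best_by_estimate)  # "fake": the forged report scores as well and costs less
print(best_by_truth)     # "fix": only genuine success has value under the real utility
```

The divergence appears only because the observation channel can be influenced separately from the latent variable the human cares about; if reports were guaranteed faithful, the two objectives would coincide.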

Posts tagged The Pointers Problem

| Relevance | Karma | Title | Author(s) | Age | Comments |
|---|---|---|---|---|---|
| 6 | 51 | The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables | johnswentworth | 3y | 32 |
| 3 | 31 | Don't design agents which exploit adversarial inputs | Alex Turner, Garrett Baker | 7mo | 27 |
| 2 | 33 | Robust Delegation | Abram Demski, Scott Garrabrant | 5y | 2 |
| 2 | 28 | Alignment allows "nonrobust" decision-influences and doesn't require robust grading | Alex Turner | 6mo | 31 |
| 2 | 25 | Don't align agents to evaluations of plans | Alex Turner | 6mo | 28 |
| 2 | 16 | [Intro to brain-like-AGI safety] 9. Takeaways from neuro 2/2: On AGI motivation | Steve Byrnes | 1y | 9 |
| 1 | 18 | People care about each other even though they have imperfect motivational pointers? | Alex Turner | 7mo | 3 |
| 2 | 8 | Stable Pointers to Value III: Recursive Quantilization | Abram Demski | 5y | 0 |
| 2 | 12 | Stable Pointers to Value II: Environmental Goals | Abram Demski | 5y | 0 |
| 2 | 9 | Stable Pointers to Value: An Agent Embedded in Its Own Utility Function | Abram Demski | 6y | 9 |
| 1 | 29 | The Pointers Problem: Clarifications/Variations | Abram Demski | 2y | 14 |
| 0 | 11 | Updating Utility Functions | JustinShovelain, Joar Skalse | 1y | 0 |