The Pointers Problem - History - AI Alignment Forum

• Applied to The Pointer Resolution Problem by Arun Jose 2mo ago

Johannes C. Mayer v1.5.0 Dec 22nd 2023 (+538/-86)

Consider an agent with a model of the world W. How does W relate to the real world? W might contain a chair. In order for W to be useful, it needs to map to reality, i.e. there is a function f with W_chair ↦ R_chair.

The pointers problem ~~refers~~ is about figuring out f.

In John's words (who introduced the concept here):

What functions of what variables (if any) in the environment and/or another world-model correspond to the ~~fact that most humans~~latent variables in the agent’s world-model?

This relates to alignment, as we would ~~rather have~~like an AI that acts based on real-world human values, not just human estimates of their own values – and that the two will be different in many situations, since humans are not all-seeing or all-knowing. ~~It was introduced in~~ ~~a post with the same name~~. Therefore we'd like to figure out how to point to our values directly.

• Applied to Human sexuality as an interesting case study of alignment by Charles Foster 1y ago

• Applied to Alignment allows "nonrobust" decision-influences and doesn't require robust grading by Alex Turner 1y ago

• Applied to Don't align agents to evaluations of plans by Alex Turner 1y ago

• Applied to Don't design agents which exploit adversarial inputs by Alex Turner 1y ago

• Applied to People care about each other even though they have imperfect motivational pointers? by Raymond Arnold 1y ago

Noosphere89 v1.4.0 Aug 6th 2022 (+9/-25)

The pointers problem refers to the fact that most humans would rather have an AI that acts based on real-world human values, not just human estimates of their own values – and that the two will be different in many situations, since humans are not all-seeing or all-~~knowing[citation needed].~~knowing. It was introduced in a post with the same name.

• Applied to The Pointers Problem: Clarifications/Variations by Linda Linsefors 2y ago

• Applied to The pointers problem, distilled by Nina Rimsky 2y ago

• Applied to Updating Utility Functions by JustinShovelain 2y ago

• Applied to [Intro to brain-like-AGI safety] 9. Takeaways from neuro 2/2: On AGI motivation by Steve Byrnes 2y ago

v1.1.0 Dec 10th 2020 (+17671) Added a brief description.

The pointers problem refers to the fact that most humans would rather have an AI that acts based on real-world human values, not just human estimates of their own values – and that the two will be different in many situations, since humans are not all-seeing or all-knowing[citation needed].It was introduced in a post with the same name.

• Applied to Robust Delegation by Abram Demski 3y ago

• Applied to Stable Pointers to Value: An Agent Embedded in Its Own Utility Function by Abram Demski 3y ago

• Applied to Stable Pointers to Value III: Recursive Quantilization by Abram Demski 3y ago

• Applied to Stable Pointers to Value II: Environmental Goals by Abram Demski 3y ago

• Applied to The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables by Abram Demski 3y ago