AI ALIGNMENT FORUM
Instrumental Convergence
• Applied to "WSJ: The internet systematically Goodharts the human brain" by Trevor1, 10d ago
• Applied to "Assessing the Capabilities of ChatGPT through Success Rates" by Repetitive Experimenter, 2mo ago
• Applied to "The Opportunity and Risks of Learning Human Values In-Context" by Repetitive Experimenter, 2mo ago
• Applied to "You can still fetch the coffee today if you're dead tomorrow" by Ruben Bloom, 2mo ago
• Applied to "Instrumental convergence is what makes general intelligence possible" by Raymond Arnold, 3mo ago
• Applied to "[ASoT] Instrumental convergence is useful" by Raymond Arnold, 3mo ago
• Applied to "POWERplay: An open-source toolchain to study AI power-seeking" by Edouard Harris, 3mo ago
• Applied to "Empowerment is (almost) All We Need" by jacob_cannell, 3mo ago
• Applied to "Instrumental convergence: scale and physical interactions" by Edouard Harris, 4mo ago
• Applied to "Misalignment-by-default in multi-agent systems" by Edouard Harris, 4mo ago
• Applied to "Instrumental convergence in single-agent systems" by Edouard Harris, 4mo ago
• Applied to "Deceptive Alignment" by Noosphere89, 4mo ago
• Applied to "You are Underestimating The Likelihood That Convergent Instrumental Subgoals Lead to Aligned AGI" by Noosphere89, 4mo ago
• Applied to "Why are we sure that AI will "want" something?" by Raymond Arnold, 4mo ago
• Applied to "Deliberation, Reactions, and Control: Tentative Definitions and a Restatement of Instrumental Convergence" by Oliver Sourbut, 5mo ago
• Applied to "Active Inference as a formalisation of instrumental convergence" by RobertM, 6mo ago
• Applied to "A Critique of AI Alignment Pessimism" by RobertM, 6mo ago
• Applied to "Circumventing interpretability: How to defeat mind-readers" by Lee Sharkey, 7mo ago