AI ALIGNMENT FORUM
Corrigibility
• Applied to Extending the Off-Switch Game: Toward a Robust Framework for AI Corrigibility by Raymond Arnold 9d ago
• Applied to A Shutdown Problem Proposal by Mateusz Bagiński 3mo ago
• Applied to Simplifying Corrigibility – Subagent Corrigibility Is Not Anti-Natural by RobertM 3mo ago
• Applied to Towards shutdownable agents via stochastic choice by Elliott Thornley 3mo ago
• Applied to Corrigibility = Tool-ness? by Tobias D. 3mo ago
• Applied to 4. Existing Writing on Corrigibility by Max Harms 4mo ago
• Applied to 3b. Formal (Faux) Corrigibility by Max Harms 4mo ago
• Applied to 3a. Towards Formal Corrigibility by Max Harms 4mo ago
• Applied to 2. Corrigibility Intuition by Max Harms 4mo ago
• Applied to Corrigibility could make things worse by ThomasCederborg 4mo ago
• Applied to 5. Open Corrigibility Questions by Ruben Bloom 4mo ago
• Applied to 0. CAST: Corrigibility as Singular Target by Max Harms 4mo ago
• Applied to 1. The CAST Strategy by Max Harms 4mo ago
• Applied to The Shutdown Problem: Incomplete Preferences as a Solution by Elliott Thornley 7mo ago
• Applied to Requirements for a Basin of Attraction to Alignment by Roger Dearnaley 8mo ago
• Applied to Nash Bargaining between Subagents doesn't solve the Shutdown Problem by A.H. 8mo ago
• Applied to Requirements for a STEM-capable AGI Value Learner (my Case for Less Doom) by Roger Dearnaley 9mo ago
• Applied to A Pedagogical Guide to Corrigibility by A.H. 9mo ago