AI ALIGNMENT FORUM
Outer Alignment
• Applied to Formalizing «Boundaries» with Markov blankets + Criticism of this approach by Chipmonk 12d ago
• Applied to A Case for AI Safety via Law by JWJohnston 12d ago
• Applied to Recreating the caring drive by Catnee 25d ago
• Applied to You can't fetch the coffee if you're dead: an AI dilemma by hennyge 1mo ago
• Applied to Democratic Fine-Tuning by Joe Edelman 1mo ago
• Applied to Enhancing Corrigibility in AI Systems through Robust Feedback Loops by Justausername 1mo ago
• Applied to Embedding Ethical Priors into AI Systems: A Bayesian Approach by Justausername 2mo ago
• Applied to Is there any existing term summarizing non-scalable oversight methods in outer alignment? by Allen Shen 2mo ago
• Applied to Preference Aggregation as Bayesian Inference by Beren Millidge 2mo ago
• Applied to Autonomous Alignment Oversight Framework (AAOF) by Justausername 2mo ago
• Applied to Supplementary Alignment Insights Through a Highly Controlled Shutdown Incentive by Justausername 2mo ago
• Applied to Simple alignment plan that maybe works by Iknownothing 2mo ago
• Applied to Task decomposition for scalable oversight (AGISF Distillation) by Charbel-Raphael Segerie 3mo ago
• Applied to [Linkpost] Introducing Superalignment by Beren Millidge 3mo ago
• Applied to Slaying the Hydra: toward a new game board for AI by Prometheus 3mo ago
• Applied to A Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL) by Miguel de Guzman 3mo ago
• Applied to Partial Simulation Extrapolation: A Proposal for Building Safer Simulators by marc/er 3mo ago
• Applied to Why "AI alignment" would better be renamed into "Artificial Intention research" by chaosmage 4mo ago