AI ALIGNMENT FORUM
Iterated Amplification
• Applied to Interpretability’s Alignment-Solving Potential: Analysis of 7 Scenarios by Evan R. Murphy, 9d ago
• Applied to HCH and Adversarial Questions by Ruben Bloom, 3mo ago
• Applied to My Overview of the AI Alignment Landscape: A Bird's Eye View by Neel Nanda, 5mo ago
• Applied to Is iterated amplification really more powerful than imitation? by Chantiel, 9mo ago
• Applied to Garrabrant and Shah on human modeling in AGI by Rob Bensinger, 10mo ago
• Applied to Thoughts on Iterated Distillation and Amplification by Waddington, 1y ago
• Applied to Mapping the Conceptual Territory in AI Existential Safety and Alignment by Jack Koch, 1y ago
• Applied to Three AI Safety Related Ideas by Joe_Collman, 1y ago
• Applied to Imitative Generalisation (AKA 'Learning the Prior') by Beth Barnes, 1y ago
• Applied to Debate update: Obfuscated arguments problem by Beth Barnes, 1y ago
• Applied to Meta-execution by niplav, 2y ago
• Applied to Security amplification by niplav, 2y ago
• Applied to Reliability amplification by niplav, 2y ago
• Applied to Techniques for optimizing worst-case performance by niplav, 2y ago
• Applied to Model splintering: moving from one imperfect model to another by Jérémy Perret, 2y ago
• Applied to Directions and desiderata for AI alignment by Jérémy Perret, 2y ago
• Applied to What are the differences between all the iterative/recursive approaches to AI alignment? by Jérémy Perret, 2y ago