AI ALIGNMENT FORUM
AI Auditing

Edited by Raemon, last updated 4th Aug 2025

Formerly "auditing games"

Posts tagged AI Auditing
Automating Auditing: An ambitious concrete technical research proposal
evhub, 4y (43 karma, 11 comments)

A transparency and interpretability tech tree
evhub, 3y (76 karma, 10 comments)

Auditing language models for hidden objectives
Sam Marks, Johannes Treutlein, dmz, Sam Bowman, Hoagy, Carson Denison, Kei Nishimura-Gasparian, 7vik, Akbir Khan, Austin Meek, Euan Ong, Christopher Olah, Fabien Roger, jeanne_, Meg, Drake Thomas, Adam Jermyn, Monte M, evhub, 7mo (82 karma, 3 comments)

Towards Alignment Auditing as a Numbers-Go-Up Science
Sam Marks, 2mo (59 karma, 8 comments)

Putting up Bumpers
Sam Bowman, 6mo (31 karma, 7 comments)

What progress have we made on automated auditing? (question post)
LawrenceC, 1y (22 karma, 0 comments)

Auditing games for high-level interpretability
Paul Colognese, 3y (18 karma, 0 comments)