Paul Colognese
Posts
2
Paul Colognese's Shortform
3y
0
30
High-level interpretability: detecting an AI's objectives
2y
0
16
Aligned AI via monitoring objectives in AutoGPT-like systems
2y
0
34
Decision Transformer Interpretability
3y
6
18
Auditing games for high-level interpretability
3y
0
15
Deception?! I ain’t got time for that!
3y
0