Paul Colognese — AI Alignment Forum
Posts
2 · Paul Colognese's Shortform · 3y · 0
30 · High-level interpretability: detecting an AI's objectives · 2y · 0
16 · Aligned AI via monitoring objectives in AutoGPT-like systems · 3y · 0
34 · Decision Transformer Interpretability · 3y · 6
18 · Auditing games for high-level interpretability · 3y · 0
15 · Deception?! I ain’t got time for that! · 3y · 0