AI ALIGNMENT FORUM
Paul Colognese
Posts
Paul Colognese — AI Alignment Forum
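Paul Colognese's Shortform (score 2, posted 3y ago, 0 comments)
High-level interpretability: detecting an AI's objectives (score 30, posted 2y ago, 0 comments)
Aligned AI via monitoring objectives in AutoGPT-like systems (score 16, posted 3y ago, 0 comments)
Decision Transformer Interpretability (score 34, posted 3y ago, 6 comments)
Auditing games for high-level interpretability (score 18, posted 3y ago, 0 comments)
Deception?! I ain’t got time for that! (score 15, posted 3y ago, 0 comments)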