AI ALIGNMENT FORUM

81
Paul Colognese

Posts

2 · Paul Colognese's Shortform · 3y · 0
30 · High-level interpretability: detecting an AI's objectives · 2y · 0
16 · Aligned AI via monitoring objectives in AutoGPT-like systems · 2y · 0
34 · Decision Transformer Interpretability · 3y · 6
18 · Auditing games for high-level interpretability · 3y · 0
15 · Deception?! I ain’t got time for that! · 3y · 0