AI ALIGNMENT FORUM

Paul Colognese


Posts

Sorted by New
Paul Colognese's Shortform (2 points · 2y · 0 comments)
High-level interpretability: detecting an AI's objectives (30 points · 2y · 0 comments)
Aligned AI via monitoring objectives in AutoGPT-like systems (16 points · 2y · 0 comments)
Decision Transformer Interpretability (34 points · 2y · 6 comments)
Auditing games for high-level interpretability (18 points · 3y · 0 comments)
Deception?! I ain't got time for that! (15 points · 3y · 0 comments)

Wikitag Contributions

No wikitag contributions to display.

Comments

Sorted by Newest

No comments found.