AI ALIGNMENT FORUM

Mikita Balesni

Posts

Sorted by new:

- Understanding strategic deception and deceptive alignment (24 karma, 6d, 0 comments)
- Paper: LLMs trained on "A is B" fail to learn "B is A" (57 karma, 7d, 0 comments)
- Paper: On measuring situational awareness in LLMs (38 karma, 1mo, 13 comments)
- Announcing Apollo Research (86 karma, 4mo, 4 comments)

Wiki Contributions

No wiki contributions to display.

Comments

No comments to display.