AI ALIGNMENT FORUM
Mikita Balesni
Posts
Sorted by New
24 | Understanding strategic deception and deceptive alignment | 6d | 0
57 | Paper: LLMs trained on “A is B” fail to learn “B is A” | 7d | 0
38 | Paper: On measuring situational awareness in LLMs | 1mo | 13
86 | Announcing Apollo Research | 4mo | 4