AI ALIGNMENT FORUM
Bartosz Cywiński — AI Alignment Forum
Bartosz Cywiński
MATS 8.0 scholar with Arthur Conmy and Sam Marks
Posts
16 · Current LLMs seem to rarely detect CoT tampering · 10d · 0
27 · Eliciting secret knowledge from language models · 2mo · 0