Bartosz Cywiński — AI Alignment Forum
Bartosz Cywiński
MATS 8.0 scholar with Arthur Conmy and Sam Marks
Posts
16 · Can we interpret latent reasoning using current mechanistic interpretability tools? · 20d · 0
18 · Current LLMs seem to rarely detect CoT tampering · 2mo · 0
27 · Eliciting secret knowledge from language models · 3mo · 0